Electron-density-informed effective and reliable de novo molecular design and optimization with ED2Mol
Pub Date: 2025-08-20 | DOI: 10.1038/s42256-025-01095-7
Mingyu Li, Kun Song, Jixiao He, Mingzhu Zhao, Gengshu You, Jie Zhong, Mengxi Zhao, Arong Li, Yu Chen, Guobin Li, Ying Kong, Jiacheng Wei, Zhaofu Wang, Jiamin Zhou, Hongbing Yang, Shichao Ma, Hailong Zhang, Irakoze Loïca Mélita, Weidong Lin, Yuhang Lu, Zhengtian Yu, Xun Lu, Yujun Zhao, Jian Zhang
Generative drug design opens avenues for discovering novel compounds within the vast chemical space, rather than conventionally screening limited libraries. However, the practical utility of generated molecules is frequently constrained: many designs prioritize a narrow range of pharmacological properties and neglect physical reliability, which lowers the success rate of subsequent wet-laboratory evaluations. To address this, we propose ED2Mol, a deep-learning-based approach that leverages fundamental electron density information to improve de novo molecular generation and optimization. Extensive evaluations across multiple benchmarks demonstrate that ED2Mol surpasses existing methods in generation success rate and achieves >97% physical reliability. It also facilitates automated hit optimization using fragment-based strategies, a capability not fully implemented by other methods. Furthermore, ED2Mol generalizes to more challenging, unseen allosteric pocket benchmarks with consistent performance. More importantly, ED2Mol has been applied to several real-world essential targets, identifying wet-laboratory-validated bioactive compounds ranging from FGFR3 orthosteric inhibitors and CDC42 allosteric inhibitors to GCK and GPRC5A allosteric activators. The directly generated binding modes of these compounds closely match molecular docking predictions and are further validated by an X-ray co-crystal structure. These results highlight ED2Mol’s potential as a useful drug design tool with enhanced effectiveness, physical reliability and practical applicability. A deep generative model is developed for de novo molecular design and optimization by leveraging electron density. Wet-laboratory assays validated its ability to generate diverse bioactive molecules, both orthosteric and allosteric, inhibitors and activators.
{"title":"Electron-density-informed effective and reliable de novo molecular design and optimization with ED2Mol","authors":"Mingyu Li, Kun Song, Jixiao He, Mingzhu Zhao, Gengshu You, Jie Zhong, Mengxi Zhao, Arong Li, Yu Chen, Guobin Li, Ying Kong, Jiacheng Wei, Zhaofu Wang, Jiamin Zhou, Hongbing Yang, Shichao Ma, Hailong Zhang, Irakoze Loïca Mélita, Weidong Lin, Yuhang Lu, Zhengtian Yu, Xun Lu, Yujun Zhao, Jian Zhang","doi":"10.1038/s42256-025-01095-7","DOIUrl":"10.1038/s42256-025-01095-7","url":null,"abstract":"Generative drug design opens avenues for discovering novel compounds within the vast chemical space rather than conventional screening against limited libraries. However, the practical utility of the generated molecules is frequently constrained, as many designs prioritize a narrow range of pharmacological properties and neglect physical reliability, which hinders the success rate of subsequent wet-laboratory evaluations. Here, to address this, we propose ED2Mol, a deep learning-based approach that leverages fundamental electron density information to improve de novo molecular generation and optimization. The extensive evaluations across multiple benchmarks demonstrate that ED2Mol surpasses existing methods in terms of the generation success rate and >97% physical reliability. It also facilitates automated hit optimization that is not fully implemented by other methods using fragment-based strategies. Furthermore, ED2Mol exhibits generalizability to more challenging, unseen allosteric pocket benchmarks, attaining consistent performance. More importantly, ED2Mol has been applied to various real-world essential targets, successfully identifying wet-laboratory-validated bioactive compounds, ranging from FGFR3 orthosteric inhibitors to CDC42 allosteric inhibitors, GCK and GPRC5A allosteric activators. The directly generated binding modes of these compounds are close to predictions through molecular docking and further validated via the X-ray co-crystal structure. All these results highlight ED2Mol’s potential as a useful tool in drug design with enhanced effectiveness, physical reliability and practical applicability. A deep generative model is developed for de novo molecular design and optimization by leveraging electron density. Wet-laboratory assays validated its reliability to generate diverse bioactive molecules—orthosteric and allosteric, inhibitors and activators.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 8","pages":"1355-1368"},"PeriodicalIF":23.9,"publicationDate":"2025-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144901527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Training data composition determines machine learning generalization and biological rule discovery
Pub Date: 2025-08-20 | DOI: 10.1038/s42256-025-01089-5
Eugen Ursu, Aygul Minnegalieva, Puneet Rawat, Maria Chernigovskaya, Robi Tacutu, Geir Kjetil Sandve, Philippe A. Robert, Victor Greiff
Supervised machine learning models depend on training datasets containing positive and negative examples: dataset composition directly impacts model performance and bias. Given the importance of machine learning for immunotherapeutic design, we examined how different negative class definitions affect model generalization and rule discovery for antibody–antigen binding. Using synthetic-structure-based binding data, we evaluated models trained with various definitions of negative sets. Our findings reveal that high out-of-distribution performance can be achieved when the negative dataset contains samples more similar to the positive dataset, despite lower in-distribution performance. Furthermore, by leveraging ground-truth information, we show that the binding rules associated with positive data change depending on the negative data used. Validation on experimental data supported the simulation-based observations. This work underscores the role of dataset composition in creating robust, generalizable and biology-aware sequence-based ML models. Negative data composition critically shapes machine learning robustness in sequence-based biological tasks. Training data composition and its implications for biological rule discovery are investigated.
{"title":"Training data composition determines machine learning generalization and biological rule discovery","authors":"Eugen Ursu, Aygul Minnegalieva, Puneet Rawat, Maria Chernigovskaya, Robi Tacutu, Geir Kjetil Sandve, Philippe A. Robert, Victor Greiff","doi":"10.1038/s42256-025-01089-5","DOIUrl":"10.1038/s42256-025-01089-5","url":null,"abstract":"Supervised machine learning models depend on training datasets containing positive and negative examples: dataset composition directly impacts model performance and bias. Given the importance of machine learning for immunotherapeutic design, we examined how different negative class definitions affect model generalization and rule discovery for antibody–antigen binding. Using synthetic-structure-based binding data, we evaluated models trained with various definitions of negative sets. Our findings reveal that high out-of-distribution performance can be achieved when the negative dataset contains more similar samples to the positive dataset, despite lower in-distribution performance. Furthermore, by leveraging ground-truth information, we show that binding rules associated with positive data change based on the negative data used. Validation on experimental data supported simulation-based observations. This work underscores the role of dataset composition in creating robust, generalizable and biology-aware sequence-based ML models. Negative data composition critically shapes machine learning robustness in sequence-based biological tasks. Training data composition and its implications are investigated on biological rule discoveries.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 8","pages":"1206-1219"},"PeriodicalIF":23.9,"publicationDate":"2025-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144898527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The importance of negative training data for robust antibody binding prediction
Pub Date: 2025-08-20 | DOI: 10.1038/s42256-025-01080-0
Wesley Ta, Jonathan M. Stokes
Thoughtfully designed negative training datasets may hold the key to more robust machine learning models. Ursu et al. reveal how negative training data composition shapes antibody prediction models and their generalizability. Sometimes, the best way to get better is to train harder.
{"title":"The importance of negative training data for robust antibody binding prediction","authors":"Wesley Ta, Jonathan M. Stokes","doi":"10.1038/s42256-025-01080-0","DOIUrl":"10.1038/s42256-025-01080-0","url":null,"abstract":"Thoughtfully designed negative training datasets may hold the key to more robust machine learning models. Ursu et al. reveal how negative training data composition shapes antibody prediction models and their generalizability. Sometimes, the best way to get better is to train harder.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 8","pages":"1192-1194"},"PeriodicalIF":23.9,"publicationDate":"2025-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144898522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A unified pre-trained deep learning framework for cross-task reaction performance prediction and synthesis planning
Pub Date: 2025-08-19 | DOI: 10.1038/s42256-025-01098-4
Li-Cheng Xu, Miao-Jiong Tang, Junyi An, Fenglei Cao, Yuan Qi
Artificial intelligence has transformed the field of precise organic synthesis. Data-driven methods, including machine learning and deep learning, have shown great promise in predicting reaction performance and synthesis planning. However, the inherent methodological divergence between numerical-regression-driven reaction performance prediction and sequence-generation-based synthesis planning creates formidable challenges in constructing a unified deep learning architecture. Here we present RXNGraphormer, a framework to jointly address these tasks through a unified pre-training approach. By synergizing graph neural networks for intramolecular pattern recognition with Transformer-based models for intermolecular interaction modelling, and training on 13 million reactions via a carefully designed strategy, RXNGraphormer achieves state-of-the-art performance across eight benchmark datasets for reactivity or selectivity prediction and forward-synthesis or retrosynthesis planning, as well as three external realistic datasets for reactivity and selectivity prediction. Notably, the model generates chemically meaningful embeddings that spontaneously cluster reactions by type without explicit supervision. This work bridges the critical gap between performance prediction and synthesis planning tasks in chemical AI, offering a versatile tool for accurate reaction prediction and synthesis design. Xu et al. present RXNGraphormer, a pre-trained model that learns bond transformation patterns from over 13 million reactions, achieving state-of-the-art accuracy in reaction performance prediction and synthesis planning.
{"title":"A unified pre-trained deep learning framework for cross-task reaction performance prediction and synthesis planning","authors":"Li-Cheng Xu, Miao-Jiong Tang, Junyi An, Fenglei Cao, Yuan Qi","doi":"10.1038/s42256-025-01098-4","DOIUrl":"10.1038/s42256-025-01098-4","url":null,"abstract":"Artificial intelligence has transformed the field of precise organic synthesis. Data-driven methods, including machine learning and deep learning, have shown great promise in predicting reaction performance and synthesis planning. However, the inherent methodological divergence between numerical regression-driven reaction performance prediction and sequence generation-based synthesis planning creates formidable challenges in constructing a unified deep learning architecture. Here we present RXNGraphormer, a framework to jointly address these tasks through a unified pre-training approach. By synergizing graph neural networks for intramolecular pattern recognition with Transformer-based models for intermolecular interaction modelling, and training on 13 million reactions via a carefully designed strategy, RXNGraphormer achieves state-of-the-art performance across eight benchmark datasets for reactivity or selectivity prediction and forward-synthesis or retrosynthesis planning, as well as three external realistic datasets for reactivity and selectivity prediction. Notably, the model generates chemically meaningful embeddings that spontaneously cluster reactions by type without explicit supervision. This work bridges the critical gap between performance prediction and synthesis planning tasks in chemical AI, offering a versatile tool for accurate reaction prediction and synthesis design. Xu et al. present RXNGraphormer, a pre-trained model that learns bond transformation patterns from over 13 million reactions, achieving state-of-the-art accuracy in reaction performance prediction and synthesis planning.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1561-1571"},"PeriodicalIF":23.9,"publicationDate":"2025-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144898568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Boosting the predictive power of protein representations with a corpus of text annotations
Pub Date: 2025-08-18 | DOI: 10.1038/s42256-025-01088-6
Haonan Duan, Marta Skreta, Leonardo Cotta, Ella Miray Rajaonson, Nikita Dhawan, Alán Aspuru-Guzik, Chris J. Maddison
Protein language models are trained to predict amino acid sequences from vast protein databases and learn to represent proteins as feature vectors. These vector representations have enabled impressive applications, from predicting mutation effects to protein folding. One of the reasons offered for the success of these models is that conserved sequence motifs tend to be important for protein fitness. Yet, the relationship between sequence conservation and fitness can be confounded by the evolutionary and environmental context. Should we, therefore, look to other data sources that may contain more direct functional information? In this work, we conduct a comprehensive study examining the effects of training protein models to predict 19 types of text annotation from UniProt. Our results show that fine-tuning protein models on a subset of these annotations enhances the models’ predictive capabilities on a variety of function prediction tasks. In particular, when evaluated on our tasks, our model outperforms the basic local alignment search tool (BLAST), which none of the pretrained protein models accomplished. Our results suggest that a much wider array of data modalities, such as text annotations, may be tapped to improve protein language models. Although protein language models have enabled major advances, they often rely on indirect signals that may not fully capture functional relevance. Fine-tuning these models on textual annotations is shown to improve their performance on function prediction tasks.
{"title":"Boosting the predictive power of protein representations with a corpus of text annotations","authors":"Haonan Duan, Marta Skreta, Leonardo Cotta, Ella Miray Rajaonson, Nikita Dhawan, Alán Aspuru-Guzik, Chris J. Maddison","doi":"10.1038/s42256-025-01088-6","DOIUrl":"10.1038/s42256-025-01088-6","url":null,"abstract":"Protein language models are trained to predict amino acid sequences from vast protein databases and learn to represent proteins as feature vectors. These vector representations have enabled impressive applications, from predicting mutation effects to protein folding. One of the reasons offered for the success of these models is that conserved sequence motifs tend to be important for protein fitness. Yet, the relationship between sequence conservation and fitness can be confounded by the evolutionary and environmental context. Should we, therefore, look to other data sources that may contain more direct functional information? In this work, we conduct a comprehensive study examining the effects of training protein models to predict 19 types of text annotation from UniProt. Our results show that fine-tuning protein models on a subset of these annotations enhances the models’ predictive capabilities on a variety of function prediction tasks. In particular, when evaluated on our tasks, our model outperforms the basic local alignment search tool, which none of the pretrained protein models accomplished. Our results suggest that a much wider array of data modalities, such as text annotations, may be tapped to improve protein language models. Although protein language models have enabled major advances, they often rely on indirect signals that may not fully capture functional relevance. Fine-tuning these models on textual annotations is shown to improve their performance on function prediction tasks.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1403-1413"},"PeriodicalIF":23.9,"publicationDate":"2025-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145129515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Quantifying artificial intelligence through algorithmic generalization
Pub Date: 2025-08-18 | DOI: 10.1038/s42256-025-01092-w
Takuya Ito, Murray Campbell, Lior Horesh, Tim Klinger, Parikshit Ram
The rapid development of artificial intelligence (AI) systems has created an urgent need for their scientific quantification. While their fluency across a variety of domains is impressive, AI systems fall short on tests requiring algorithmic reasoning—a glaring limitation, given the necessity for interpretable and reliable technology. Despite a surge in reasoning benchmarks emerging from the academic community, no theoretical framework exists to quantify algorithmic reasoning in AI systems. Here we adopt a framework from computational complexity theory to quantify algorithmic generalization using algebraic expressions: algebraic circuit complexity. Algebraic circuit complexity theory—the study of algebraic expressions as circuit models—is a natural framework for studying the complexity of algorithmic computation. Algebraic circuit complexity enables the study of generalization by defining benchmarks in terms of the computational requirements for solving a problem. Moreover, algebraic circuits are generic mathematical objects; an arbitrarily large number of samples can be generated for a specified circuit, making it an ideal experimental sandbox for the data-hungry models that are used today. In this Perspective, we adopt tools from algebraic circuit complexity, apply them to formalize a science of algorithmic generalization, and address key challenges for its successful application to AI science. Despite impressive performances of current large AI models, symbolic and abstract reasoning tasks often elicit failure modes in these systems. In this Perspective, Ito et al. propose to make use of computational complexity theory, formulating algebraic problems as computable circuits to address the challenge of mathematical and symbolic reasoning in AI systems.
{"title":"Quantifying artificial intelligence through algorithmic generalization","authors":"Takuya Ito, Murray Campbell, Lior Horesh, Tim Klinger, Parikshit Ram","doi":"10.1038/s42256-025-01092-w","DOIUrl":"10.1038/s42256-025-01092-w","url":null,"abstract":"The rapid development of artificial intelligence (AI) systems has created an urgent need for their scientific quantification. While their fluency across a variety of domains is impressive, AI systems fall short on tests requiring algorithmic reasoning—a glaring limitation, given the necessity for interpretable and reliable technology. Despite a surge in reasoning benchmarks emerging from the academic community, no theoretical framework exists to quantify algorithmic reasoning in AI systems. Here we adopt a framework from computational complexity theory to quantify algorithmic generalization using algebraic expressions: algebraic circuit complexity. Algebraic circuit complexity theory—the study of algebraic expressions as circuit models—is a natural framework for studying the complexity of algorithmic computation. Algebraic circuit complexity enables the study of generalization by defining benchmarks in terms of the computational requirements for solving a problem. Moreover, algebraic circuits are generic mathematical objects; an arbitrarily large number of samples can be generated for a specified circuit, making it an ideal experimental sandbox for the data-hungry models that are used today. In this Perspective, we adopt tools from algebraic circuit complexity, apply them to formalize a science of algorithmic generalization, and address key challenges for its successful application to AI science. Despite impressive performances of current large AI models, symbolic and abstract reasoning tasks often elicit failure modes in these systems. In this Perspective, Ito et al. propose to make use of computational complexity theory, formulating algebraic problems as computable circuits to address the challenge of mathematical and symbolic reasoning in AI systems.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 8","pages":"1195-1205"},"PeriodicalIF":23.9,"publicationDate":"2025-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145123738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Informed protein–ligand docking via geodesic guidance in translational, rotational and torsional spaces
Pub Date: 2025-08-15 | DOI: 10.1038/s42256-025-01091-x
Raúl Miñán, Javier Gallardo, Álvaro Ciudad, Alexis Molina
Molecular docking plays a crucial role in structure-based drug discovery, enabling the prediction of how small molecules interact with protein targets. Traditional docking methods rely on scoring functions and search heuristics, whereas recent generative approaches, such as DiffDock, leverage deep learning for pose prediction. However, blind-diffusion-based docking often struggles with binding site localization and pose accuracy, particularly in complex protein–ligand systems. This work introduces GeoDirDock (GDD), a guided diffusion approach to molecular docking that enhances the accuracy and physical plausibility of ligand docking predictions. GDD guides the denoising process of a diffusion model along geodesic paths within multiple spaces representing translational, rotational and torsional degrees of freedom. Our method leverages expert knowledge to direct the generative modelling process, specifically targeting desired protein–ligand interaction regions. We demonstrate that GDD outperforms existing blind docking methods in terms of root-mean-square deviation (RMSD) accuracy and physicochemical pose realism. Our results indicate that incorporating domain expertise into the diffusion process leads to more biologically relevant docking predictions. Additionally, we explore the potential of GDD as a template-based modelling tool for lead optimization in drug discovery through angle transfer in maximum common substructure docking, showcasing its capability to accurately predict ligand orientations for chemically similar compounds. Future applications in real-world drug discovery campaigns will naturally continue to refine and extend the utility of prior-informed diffusion docking methods. GeoDirDock is a framework that guides the denoising process of a generative diffusion docking model along geodesic paths within multiple spaces representing translational, rotational and torsional degrees of freedom. This approach enhances the accuracy and physical plausibility of ligand docking predictions.
{"title":"Informed protein–ligand docking via geodesic guidance in translational, rotational and torsional spaces","authors":"Raúl Miñán, Javier Gallardo, Álvaro Ciudad, Alexis Molina","doi":"10.1038/s42256-025-01091-x","DOIUrl":"10.1038/s42256-025-01091-x","url":null,"abstract":"Molecular docking plays a crucial role in structure-based drug discovery, enabling the prediction of how small molecules interact with protein targets. Traditional docking methods rely on scoring functions and search heuristics, whereas recent generative approaches, such as DiffDock, leverage deep learning for pose prediction. However, blind-diffusion-based docking often struggles with binding site localization and pose accuracy, particularly in complex protein–ligand systems. This work introduces GeoDirDock (GDD), a guided diffusion approach to molecular docking that enhances the accuracy and physical plausibility of ligand docking predictions. GDD guides the denoising process of a diffusion model along geodesic paths within multiple spaces representing translational, rotational and torsional degrees of freedom. Our method leverages expert knowledge to direct the generative modelling process, specifically targeting desired protein–ligand interaction regions. We demonstrate that GDD outperforms existing blind docking methods in terms of root mean squared distance accuracy and physicochemical pose realism. Our results indicate that incorporating domain expertise into the diffusion process leads to more biologically relevant docking predictions. Additionally, we explore the potential of GDD as a template-based modelling tool for lead optimization in drug discovery through angle transfer in maximum common substructure docking, showcasing its capability to accurately predict ligand orientations for chemically similar compounds. Future applications in real-world drug discovery campaigns will naturally continue to refine and extend the utility of prior-informed diffusion docking methods. GeoDirDock is a framework that guides the denoising process of a generative diffusion docking model along geodesic paths within multiple spaces representing translational, rotational and torsional degrees of freedom. This approach enhances the accuracy and physical plausibility of ligand docking predictions.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1555-1560"},"PeriodicalIF":23.9,"publicationDate":"2025-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144851536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mechanistic understanding and validation of large AI models with SemanticLens
Pub Date: 2025-08-14 | DOI: 10.1038/s42256-025-01084-w
Maximilian Dreyer, Jim Berend, Tobias Labarta, Johanna Vielhaben, Thomas Wiegand, Sebastian Lapuschkin, Wojciech Samek
Unlike human-engineered systems, such as aeroplanes, for which the role and dependencies of each component are well understood, the inner workings of artificial intelligence models remain largely opaque, which hinders verifiability and undermines trust. Current approaches to neural network interpretability, including input attribution methods, probe-based analysis and activation visualization techniques, typically provide limited insights about the role of individual components or require extensive manual interpretation that cannot scale with model complexity. This paper introduces SemanticLens, a universal explanation method for neural networks that maps hidden knowledge encoded by components (for example, individual neurons) into the semantically structured, multimodal space of a foundation model such as CLIP. In this space, unique operations become possible, including (1) textual searches to identify neurons encoding specific concepts, (2) systematic analysis and comparison of model representations, (3) automated labelling of neurons and explanation of their functional roles, and (4) audits to validate decision-making against requirements. Fully scalable and operating without human input, SemanticLens is shown to be effective for debugging and validation, summarizing model knowledge, aligning reasoning with expectations (for example, adherence to the ABCDE rule in melanoma classification) and detecting components tied to spurious correlations and their associated training data. By enabling component-level understanding and validation, the proposed approach helps mitigate the opacity that limits confidence in artificial intelligence systems compared to traditional engineered systems, enabling more reliable deployment in critical applications. SemanticLens is a tool that embeds artificial intelligence model components (such as neurons) into a searchable, human-understandable space. This enables automated auditing, validation of decisions and detection of problematic behaviours with minimal human oversight.
{"title":"Mechanistic understanding and validation of large AI models with SemanticLens","authors":"Maximilian Dreyer, Jim Berend, Tobias Labarta, Johanna Vielhaben, Thomas Wiegand, Sebastian Lapuschkin, Wojciech Samek","doi":"10.1038/s42256-025-01084-w","DOIUrl":"10.1038/s42256-025-01084-w","url":null,"abstract":"Unlike human-engineered systems, such as aeroplanes, for which the role and dependencies of each component are well understood, the inner workings of artificial intelligence models remain largely opaque, which hinders verifiability and undermines trust. Current approaches to neural network interpretability, including input attribution methods, probe-based analysis and activation visualization techniques, typically provide limited insights about the role of individual components or require extensive manual interpretation that cannot scale with model complexity. This paper introduces SemanticLens, a universal explanation method for neural networks that maps hidden knowledge encoded by components (for example, individual neurons) into the semantically structured, multimodal space of a foundation model such as CLIP. In this space, unique operations become possible, including (1) textual searches to identify neurons encoding specific concepts, (2) systematic analysis and comparison of model representations, (3) automated labelling of neurons and explanation of their functional roles, and (4) audits to validate decision-making against requirements. Fully scalable and operating without human input, SemanticLens is shown to be effective for debugging and validation, summarizing model knowledge, aligning reasoning with expectations (for example, adherence to the ABCDE rule in melanoma classification) and detecting components tied to spurious correlations and their associated training data. By enabling component-level understanding and validation, the proposed approach helps mitigate the opacity that limits confidence in artificial intelligence systems compared to traditional engineered systems, enabling more reliable deployment in critical applications. SemanticLens is a tool that embeds artificial intelligence model components (such as neurons) into a searchable, human-understandable space. This enables automated auditing, validation of decisions and detection of problematic behaviours with minimal human oversight.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1572-1585"},"PeriodicalIF":23.9,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.comhttps://www.nature.com/articles/s42256-025-01084-w.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144840346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deep learning-based prediction of the selection factors for quantifying selection in immune receptor repertoires
Pub Date: 2025-08-11 | DOI: 10.1038/s42256-025-01085-9
Yuepeng Jiang, Pingping Zhang, Miaozhe Huo, Shuai Cheng Li
T cell selection is a vital process in which precursor T cells mature into functional cells. Accurately modelling and quantifying T cell selection utilizing high-throughput T cell receptor (TCR) sequencing data presents an important computational challenge in immunology. Statistical modelling of TCR repertoires allows the assessment of selection force through the selection factor that bridges the pre- and post-selection distributions. Current tools derive the principles underlying this selection factor through weakly supervised learning, limiting the effective use of available data. To overcome this shortcoming, we introduce TCRsep, a deep learning framework designed to directly learn the selection factor in a supervised training context. The performance and advantage of TCRsep were extensively validated across various scenarios using both simulated and real datasets. By applying TCRsep to over 1,500 repertoire samples, we elucidate the correlation between selection and repertoire diversities in aging, explore the stability and individuality of selection over short time frames, investigate the role of selection in defining TCR sharing profiles and demonstrate its efficiency in identifying candidate disease-associated TCRs based on their sharing profiles. In particular, these identified TCRs were further utilized for diagnosing cytomegalovirus infection, achieving high predictive accuracy. In conclusion, TCRsep substantially improves selection factor prediction and serves as a valuable discovery tool for clinical applications. TCRsep, a deep learning model for predicting selection factors that quantifies the T cell selection process, is introduced. In addition, various benchmarks are designed to evaluate the selection models, demonstrating that TCRsep outperforms state-of-the-art models.
{"title":"Deep learning-based prediction of the selection factors for quantifying selection in immune receptor repertoires","authors":"Yuepeng Jiang, Pingping Zhang, Miaozhe Huo, Shuai Cheng Li","doi":"10.1038/s42256-025-01085-9","DOIUrl":"10.1038/s42256-025-01085-9","url":null,"abstract":"T cell selection is a vital process in which precursor T cells mature into functional cells. Accurately modelling and quantifying T cell selection utilizing high-throughput T cell receptor (TCR) sequencing data presents an important computational challenge in immunology. Statistical modelling of TCR repertoires allows the assessment of selection force through the selection factor that bridges the pre- and post-selection distributions. Current tools derive the principles underlying this selection factor through weakly supervised learning, limiting the effective use of available data. To overcome this shortcoming, we introduce TCRsep, a deep learning framework designed to directly learn the selection factor in a supervised training context. The performance and advantage of TCRsep were extensively validated across various scenarios using both simulated and real datasets. By applying TCRsep to over 1,500 repertoire samples, we elucidate the correlation between selection and repertoire diversities in aging, explore the stability and individuality of selection over short time frames, investigate the role of selection in defining TCR sharing profiles and demonstrate its efficiency in identifying candidate-disease-associated TCRs based on their sharing profiles. In particular, these identified TCRs were further utilized for diagnosing cytomegalovirus infection, achieving high predictive accuracy. In conclusion, TCRsep substantially improves the selection factor prediction and serves as a valuable discovery tool for clinical applications. TCRsep, a deep learning model for predicting selection factors that quantifies the T cell selection process, is introduced. Also, various benchmarks are designed to evaluate the selection models, demonstrating that TCRsep outperforms state-of-the-art models.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 8","pages":"1331-1345"},"PeriodicalIF":23.9,"publicationDate":"2025-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144819971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kolmogorov–Arnold graph neural networks for molecular property prediction
Pub Date: 2025-08-11 | DOI: 10.1038/s42256-025-01087-7
Longlong Li, Yipeng Zhang, Guanghui Wang, Kelin Xia
Graph neural networks (GNNs) have shown remarkable success in molecular property prediction as key models in geometric deep learning. Meanwhile, Kolmogorov–Arnold networks (KANs) have emerged as powerful alternatives to multi-layer perceptrons, offering improved expressivity, parameter efficiency and interpretability. To combine the strengths of both frameworks, we propose Kolmogorov–Arnold GNNs (KA-GNNs), which integrate KAN modules into the three fundamental components of GNNs: node embedding, message passing and readout. We further introduce Fourier-series-based univariate functions within KAN to enhance function approximation and provide theoretical analysis to support their expressiveness. Two architectural variants, KA-graph convolutional networks and KA-augmented graph attention networks, are developed and evaluated across seven molecular benchmarks. Experimental results show that KA-GNNs consistently outperform conventional GNNs in terms of both prediction accuracy and computational efficiency. Moreover, our models exhibit improved interpretability by highlighting chemically meaningful substructures. These findings demonstrate that KA-GNNs offer a powerful and generalizable framework for molecular data modelling, drug discovery and beyond. Li et al. developed KA-GNNs, graph neural network architectures enhanced by Kolmogorov–Arnold networks, which improve accuracy and interpretability in molecular property prediction and extend geometric deep learning to scientific domains.
{"title":"Kolmogorov–Arnold graph neural networks for molecular property prediction","authors":"Longlong Li, Yipeng Zhang, Guanghui Wang, Kelin Xia","doi":"10.1038/s42256-025-01087-7","DOIUrl":"10.1038/s42256-025-01087-7","url":null,"abstract":"Graph neural networks (GNNs) have shown remarkable success in molecular property prediction as key models in geometric deep learning. Meanwhile, Kolmogorov–Arnold networks (KANs) have emerged as powerful alternatives to multi-layer perceptrons, offering improved expressivity, parameter efficiency and interpretability. To combine the strengths of both frameworks, we propose Kolmogorov–Arnold GNNs (KA-GNNs), which integrate KAN modules into the three fundamental components of GNNs: node embedding, message passing and readout. We further introduce Fourier-series-based univariate functions within KAN to enhance function approximation and provide theoretical analysis to support their expressiveness. Two architectural variants, KA-graph convolutional networks and KA-augmented graph attention networks, are developed and evaluated across seven molecular benchmarks. Experimental results show that KA-GNNs consistently outperform conventional GNNs in terms of both prediction accuracy and computational efficiency. Moreover, our models exhibit improved interpretability by highlighting chemically meaningful substructures. These findings demonstrate that KA-GNNs offer a powerful and generalizable framework for molecular data modelling, drug discovery and beyond. Li et al. developed KA-GNNs, graph neural network architectures enhanced by Kolmogorov–Arnold networks, which improve accuracy and interpretability in molecular property prediction and extend geometric deep learning to scientific domains.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 8","pages":"1346-1354"},"PeriodicalIF":23.9,"publicationDate":"2025-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.comhttps://www.nature.com/articles/s42256-025-01087-7.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144819239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}