Error-controlled non-additive interaction discovery in machine learning models
Pub Date: 2025-09-22 | DOI: 10.1038/s42256-025-01086-8
Winston Chen, Yifan Jiang, William Stafford Noble, Yang Young Lu
Machine learning (ML) models are powerful tools for detecting complex patterns, yet their ‘black-box’ nature limits their interpretability, hindering their use in critical domains like healthcare and finance. Interpretable ML methods aim to explain how features influence model predictions but often focus on univariate feature importance, overlooking complex feature interactions. Although recent efforts extend interpretability to feature interactions, existing approaches struggle with robustness and error control, especially under data perturbations. In this study, we introduce Diamond, a method for trustworthy feature interaction discovery. Diamond uniquely integrates the model-X knockoffs framework to control the false discovery rate, ensuring a low proportion of falsely detected interactions. Diamond includes a non-additivity distillation procedure that refines existing interaction importance measures to isolate non-additive interaction effects and preserve false discovery rate control. This approach addresses the limitations of off-the-shelf interaction measures, which, when used naively, can lead to inaccurate discoveries. Diamond’s applicability spans a broad class of ML models, including deep neural networks, transformers, tree-based models and factorization-based models. Empirical evaluations on both simulated and real datasets across various biomedical studies demonstrate its utility in enabling reliable data-driven scientific discoveries. Diamond represents a significant step forward in leveraging ML for scientific innovation and hypothesis generation. Diamond, a statistically rigorous method, is capable of finding meaningful feature interactions within machine learning models, making black-box models more interpretable for science and medicine.
{"title":"Error-controlled non-additive interaction discovery in machine learning models","authors":"Winston Chen, Yifan Jiang, William Stafford Noble, Yang Young Lu","doi":"10.1038/s42256-025-01086-8","DOIUrl":"10.1038/s42256-025-01086-8","url":null,"abstract":"Machine learning (ML) models are powerful tools for detecting complex patterns, yet their ‘black-box’ nature limits their interpretability, hindering their use in critical domains like healthcare and finance. Interpretable ML methods aim to explain how features influence model predictions but often focus on univariate feature importance, overlooking complex feature interactions. Although recent efforts extend interpretability to feature interactions, existing approaches struggle with robustness and error control, especially under data perturbations. In this study, we introduce Diamond, a method for trustworthy feature interaction discovery. Diamond uniquely integrates the model-X knockoffs framework to control the false discovery rate, ensuring a low proportion of falsely detected interactions. Diamond includes a non-additivity distillation procedure that refines existing interaction importance measures to isolate non-additive interaction effects and preserve false discovery rate control. This approach addresses the limitations of off-the-shelf interaction measures, which, when used naively, can lead to inaccurate discoveries. Diamond’s applicability spans a broad class of ML models, including deep neural networks, transformers, tree-based models and factorization-based models. Empirical evaluations on both simulated and real datasets across various biomedical studies demonstrate its utility in enabling reliable data-driven scientific discoveries. Diamond represents a significant step forward in leveraging ML for scientific innovation and hypothesis generation. Diamond, a statistically rigorous method, is capable of finding meaningful feature interactions within machine learning models, making black-box models more interpretable for science and medicine.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1541-1554"},"PeriodicalIF":23.9,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.comhttps://www.nature.com/articles/s42256-025-01086-8.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145129499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Modelling neural coding in the auditory midbrain with high resolution and accuracy
Pub Date: 2025-09-18 | DOI: 10.1038/s42256-025-01104-9
Fotios Drakopoulos, Lloyd Pellatt, Shievanie Sabesan, Yiqing Xia, Andreas Fragner, Nicholas A. Lesica
Computational models of auditory processing can be valuable tools for research and technology development. Models of the cochlea are highly accurate and widely used, but models of the auditory brain lag far behind in both performance and penetration. Here we present ICNet, a convolutional encoder–decoder model of neural coding in the inferior colliculus. We developed ICNet using large-scale intracranial recordings from anaesthetized gerbils, addressing three key modelling challenges that are common across all sensory systems: capturing the full statistical structure of neuronal response patterns; accounting for physiological and experimental non-stationarity; and extracting features of sensory processing that are shared across different brains. ICNet provides highly accurate simulation of multi-unit neural responses to a wide range of complex sounds, including near-perfect responses to speech. It also reproduces key neurophysiological phenomena such as forward masking and dynamic range adaptation. ICNet can be used to simulate activity from thousands of neural units or to provide a compact representation of early central auditory processing through its latent dynamics, facilitating a wide range of hearing and audio applications. It can also serve as a foundation core, providing a baseline neural representation for models of active listening or higher-level auditory processing. Drakopoulos et al. present a model that captures the transformation from sound waves to neural activity patterns underlying early auditory processing. The model reproduces neural responses to a range of complex sounds and key neurophysiological phenomena.
{"title":"Modelling neural coding in the auditory midbrain with high resolution and accuracy","authors":"Fotios Drakopoulos, Lloyd Pellatt, Shievanie Sabesan, Yiqing Xia, Andreas Fragner, Nicholas A. Lesica","doi":"10.1038/s42256-025-01104-9","DOIUrl":"10.1038/s42256-025-01104-9","url":null,"abstract":"Computational models of auditory processing can be valuable tools for research and technology development. Models of the cochlea are highly accurate and widely used, but models of the auditory brain lag far behind in both performance and penetration. Here we present ICNet, a convolutional encoder–decoder model of neural coding in the inferior colliculus. We developed ICNet using large-scale intracranial recordings from anaesthetized gerbils, addressing three key modelling challenges that are common across all sensory systems: capturing the full statistical structure of neuronal response patterns; accounting for physiological and experimental non-stationarity; and extracting features of sensory processing that are shared across different brains. ICNet provides highly accurate simulation of multi-unit neural responses to a wide range of complex sounds, including near-perfect responses to speech. It also reproduces key neurophysiological phenomena such as forward masking and dynamic range adaptation. ICNet can be used to simulate activity from thousands of neural units or to provide a compact representation of early central auditory processing through its latent dynamics, facilitating a wide range of hearing and audio applications. It can also serve as a foundation core, providing a baseline neural representation for models of active listening or higher-level auditory processing. Drakopoulos et al. present a model that captures the transformation from sound waves to neural activity patterns underlying early auditory processing. The model reproduces neural responses to a range of complex sounds and key neurophysiological phenomena.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1478-1493"},"PeriodicalIF":23.9,"publicationDate":"2025-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.comhttps://www.nature.com/articles/s42256-025-01104-9.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145129516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Author Correction: Deep learning-based prediction of the selection factors for quantifying selection in immune receptor repertoires
Pub Date: 2025-09-17 | DOI: 10.1038/s42256-025-01128-1
Yuepeng Jiang, Pingping Zhang, Miaozhe Huo, Shuai Cheng Li
{"title":"Author Correction: Deep learning-based prediction of the selection factors for quantifying selection in immune receptor repertoires","authors":"Yuepeng Jiang, Pingping Zhang, Miaozhe Huo, Shuai Cheng Li","doi":"10.1038/s42256-025-01128-1","DOIUrl":"10.1038/s42256-025-01128-1","url":null,"abstract":"","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1587-1587"},"PeriodicalIF":23.9,"publicationDate":"2025-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.comhttps://www.nature.com/articles/s42256-025-01128-1.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145129498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aligning generalization between humans and machines
Pub Date: 2025-09-15 | DOI: 10.1038/s42256-025-01109-4
Filip Ilievski, Barbara Hammer, Frank van Harmelen, Benjamin Paassen, Sascha Saralajew, Ute Schmid, Michael Biehl, Marianna Bolognesi, Xin Luna Dong, Kiril Gashteovski, Pascal Hitzler, Giuseppe Marra, Pasquale Minervini, Martin Mundt, Axel-Cyrille Ngonga Ngomo, Alessandro Oltramari, Gabriella Pasi, Zeynep G. Saribatur, Luciano Serafini, John Shawe-Taylor, Vered Shwartz, Gabriella Skitalinskaya, Clemens Stachl, Gido M. van de Ven, Thomas Villmann
Recent advances in artificial intelligence (AI)—including generative approaches—have resulted in technology that can support humans in scientific discovery and decision-making, but may also disrupt democracies and target individuals. The responsible use of AI and its participation in human–AI teams increasingly show the need for AI alignment, that is, making AI systems act according to our preferences. A crucial yet often overlooked aspect of these interactions is the different ways in which humans and machines generalize. In cognitive science, human generalization commonly involves abstraction and concept learning. By contrast, AI generalization encompasses out-of-domain generalization in machine learning, rule-based reasoning in symbolic AI, and abstraction in neurosymbolic AI. Here we combine insights from AI and cognitive science to identify key commonalities and differences across three dimensions: notions of, methods for, and evaluation of generalization. We map the different conceptualizations of generalization in AI and cognitive science along these three dimensions and consider their role for alignment in human–AI teaming. This results in interdisciplinary challenges across AI and cognitive science that must be tackled to support effective and cognitively grounded alignment in human–AI teaming scenarios. Ilievski et al. examine differences and similarities in the various ways human and AI systems generalize. The insights are important for effectively supporting alignment in human–AI teams.
{"title":"Aligning generalization between humans and machines","authors":"Filip Ilievski, Barbara Hammer, Frank van Harmelen, Benjamin Paassen, Sascha Saralajew, Ute Schmid, Michael Biehl, Marianna Bolognesi, Xin Luna Dong, Kiril Gashteovski, Pascal Hitzler, Giuseppe Marra, Pasquale Minervini, Martin Mundt, Axel-Cyrille Ngonga Ngomo, Alessandro Oltramari, Gabriella Pasi, Zeynep G. Saribatur, Luciano Serafini, John Shawe-Taylor, Vered Shwartz, Gabriella Skitalinskaya, Clemens Stachl, Gido M. van de Ven, Thomas Villmann","doi":"10.1038/s42256-025-01109-4","DOIUrl":"10.1038/s42256-025-01109-4","url":null,"abstract":"Recent advances in artificial intelligence (AI)—including generative approaches—have resulted in technology that can support humans in scientific discovery and forming decisions, but may also disrupt democracies and target individuals. The responsible use of AI and its participation in human–AI teams increasingly shows the need for AI alignment, that is, to make AI systems act according to our preferences. A crucial yet often overlooked aspect of these interactions is the different ways in which humans and machines generalize. In cognitive science, human generalization commonly involves abstraction and concept learning. By contrast, AI generalization encompasses out-of-domain generalization in machine learning, rule-based reasoning in symbolic AI, and abstraction in neurosymbolic AI. Here we combine insights from AI and cognitive science to identify key commonalities and differences across three dimensions: notions of, methods for, and evaluation of generalization. We map the different conceptualizations of generalization in AI and cognitive science along these three dimensions and consider their role for alignment in human–AI teaming. This results in interdisciplinary challenges across AI and cognitive science that must be tackled to support effective and cognitively supported alignment in human–AI teaming scenarios. Ilievski et al. examine differences and similarities in the various ways human and AI systems generalize. The insights are important for effectively supporting alignment in human–AI teams.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1378-1389"},"PeriodicalIF":23.9,"publicationDate":"2025-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145059570","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Target-specific de novo design of drug candidate molecules with graph-transformer-based generative adversarial networks
Pub Date: 2025-09-15 | DOI: 10.1038/s42256-025-01082-y
Atabey Ünlü, Elif Çevrim, Melih Gökay Yiğit, Ahmet Sarıgün, Hayriye Çelikbilek, Osman Bayram, Deniz Cansen Kahraman, Abdurrahman Olğaç, Ahmet Sureyya Rifaioglu, Erden Banoğlu, Tunca Doğan
Discovering novel drug candidate molecules is a fundamental step in drug development. Generative deep learning models can sample new molecular structures from learned probability distributions; however, their practical use in drug discovery hinges on generating compounds tailored to a specific target molecule. Here we introduce DrugGEN, an end-to-end generative system for the de novo design of drug candidate molecules that interact with a selected protein. The proposed method represents molecules as graphs and processes them using a generative adversarial network that comprises graph transformer layers. Trained on large datasets of drug-like compounds and target-specific bioactive molecules, DrugGEN designed candidate inhibitors for AKT1, a kinase crucial in many cancers. Docking and molecular dynamics simulations suggest that the generated compounds effectively bind to AKT1, and attention maps provide insights into the model’s reasoning. Furthermore, selected de novo molecules were synthesized and shown to inhibit AKT1 at low micromolar concentrations in the context of in vitro enzymatic assays. These results demonstrate the potential of DrugGEN for designing target-specific molecules. Using the open-access DrugGEN codebase, researchers can retrain the model for other druggable proteins, provided a dataset of known bioactive molecules is available. Inhibiting AKT1 kinase can have potentially positive uses against many types of cancer. To find novel molecules targeting this protein, a graph adversarial network is trained as a generative model.
{"title":"Target-specific de novo design of drug candidate molecules with graph-transformer-based generative adversarial networks","authors":"Atabey Ünlü, Elif Çevrim, Melih Gökay Yiğit, Ahmet Sarıgün, Hayriye Çelikbilek, Osman Bayram, Deniz Cansen Kahraman, Abdurrahman Olğaç, Ahmet Sureyya Rifaioglu, Erden Banoğlu, Tunca Doğan","doi":"10.1038/s42256-025-01082-y","DOIUrl":"10.1038/s42256-025-01082-y","url":null,"abstract":"Discovering novel drug candidate molecules is a fundamental step in drug development. Generative deep learning models can sample new molecular structures from learned probability distributions; however, their practical use in drug discovery hinges on generating compounds tailored to a specific target molecule. Here we introduce DrugGEN, an end-to-end generative system for the de novo design of drug candidate molecules that interact with a selected protein. The proposed method represents molecules as graphs and processes them using a generative adversarial network that comprises graph transformer layers. Trained on large datasets of drug-like compounds and target-specific bioactive molecules, DrugGEN designed candidate inhibitors for AKT1, a kinase crucial in many cancers. Docking and molecular dynamics simulations suggest that the generated compounds effectively bind to AKT1, and attention maps provide insights into the model’s reasoning. Furthermore, selected de novo molecules were synthesized and shown to inhibit AKT1 at low micromolar concentrations in the context of in vitro enzymatic assays. These results demonstrate the potential of DrugGEN for designing target-specific molecules. Using the open-access DrugGEN codebase, researchers can retrain the model for other druggable proteins, provided a dataset of known bioactive molecules is available. Inhibiting AKT1 kinase can have potentially positive uses against many types of cancer. To find novel molecules targeting this protein, a graph adversarial network is trained as a generative model.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1524-1540"},"PeriodicalIF":23.9,"publicationDate":"2025-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145059571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sampling-enabled scalable manifold learning unveils the discriminative cluster structure of high-dimensional data
Pub Date: 2025-09-10 | DOI: 10.1038/s42256-025-01112-9
Dehua Peng, Zhipeng Gui, Wenzhang Wei, Fa Li, Jie Gui, Huayi Wu, Jianya Gong
As a pivotal branch of machine learning, manifold learning uncovers the intrinsic low-dimensional structure within complex non-linear manifolds in high-dimensional space for visualization, classification, clustering and gaining key insights. Although existing techniques have achieved remarkable successes, they suffer from extensive distortions of cluster structure, which hinders the understanding of underlying patterns. Scalability issues also limit their applicability for handling large-scale data. Here we propose a sampling-based scalable manifold learning technique that enables uniform and discriminative embedding (SUDE) for large-scale and high-dimensional data. It starts by seeking a set of landmarks to construct the low-dimensional skeleton of the entire data and then incorporates the non-landmarks into the learned space by constrained locally linear embedding. We empirically validated the effectiveness of SUDE on synthetic datasets and real-world benchmarks and applied it to analyse single-cell data and detect anomalies in electrocardiogram signals. SUDE exhibits a distinct advantage in scalability with respect to data size and embedding dimension and shows promising performance in cluster separation, integrity and global structure preservation. The experiments also demonstrate notable robustness in embedding quality as the sampling rate decreases. A sampling-based manifold learning method is proposed to study the cluster structure of high-dimensional data. Its applicability and scalability have been verified in single-cell data analysis and anomaly detection in electrocardiogram signals.
{"title":"Sampling-enabled scalable manifold learning unveils the discriminative cluster structure of high-dimensional data","authors":"Dehua Peng, Zhipeng Gui, Wenzhang Wei, Fa Li, Jie Gui, Huayi Wu, Jianya Gong","doi":"10.1038/s42256-025-01112-9","DOIUrl":"10.1038/s42256-025-01112-9","url":null,"abstract":"As a pivotal branch of machine learning, manifold learning uncovers the intrinsic low-dimensional structure within complex non-linear manifolds in high-dimensional space for visualization, classification, clustering and gaining key insights. Although existing techniques have achieved remarkable successes, they suffer from extensive distortions of cluster structure, which hinders the understanding of underlying patterns. Scalability issues also limit their applicability for handling large-scale data. Here we propose a sampling-based scalable manifold learning technique that enables uniform and discriminative embedding (SUDE) for large-scale and high-dimensional data. It starts by seeking a set of landmarks to construct the low-dimensional skeleton of the entire data and then incorporates the non-landmarks into the learned space by constrained locally linear embedding. We empirically validated the effectiveness of SUDE on synthetic datasets and real-world benchmarks and applied it to analyse single-cell data and detect anomalies in electrocardiogram signals. SUDE exhibits a distinct advantage in scalability with respect to data size and embedding dimension and shows promising performance in cluster separation, integrity and global structure preservation. The experiments also demonstrate notable robustness in embedding quality as the sampling rate decreases. A sampling-based manifold learning method is proposed to study the cluster structure of high-dimensional data. Its applicability and scalability have been verified in single-cell data analysis and anomaly detection in electrocardiogram signals.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 10","pages":"1669-1684"},"PeriodicalIF":23.9,"publicationDate":"2025-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145025291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards agentic science for advancing scientific discovery
Pub Date: 2025-09-10 | DOI: 10.1038/s42256-025-01110-x
Hongliang Xin, John R. Kitchin, Heather J. Kulik
Artificial intelligence is transforming scientific discovery through (semi-)autonomous agents capable of reasoning, planning, and interacting with digital and physical environments. This Comment explores the foundations and frontiers of agentic science, outlining its emerging directions, current limitations, and the pathways for responsible integration into scientific practice.
{"title":"Towards agentic science for advancing scientific discovery","authors":"Hongliang Xin, John R. Kitchin, Heather J. Kulik","doi":"10.1038/s42256-025-01110-x","DOIUrl":"10.1038/s42256-025-01110-x","url":null,"abstract":"Artificial intelligence is transforming scientific discovery through (semi-)autonomous agents capable of reasoning, planning, and interacting with digital and physical environments. This Comment explores the foundations and frontiers of agentic science, outlining its emerging directions, current limitations, and the pathways for responsible integration into scientific practice.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1373-1375"},"PeriodicalIF":23.9,"publicationDate":"2025-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145025286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Real-world validation of a structure-aware pipeline for molecular design
Pub Date: 2025-09-08 | DOI: 10.1038/s42256-025-01102-x
Ana Laura Dias, Tiago Rodrigues
The next major challenge for artificial intelligence in drug development lies in proving its value in real-world settings. A new technology not only supports the generation of novel chemical entities but also accelerates a range of real-world molecular design tasks.
{"title":"Real-world validation of a structure-aware pipeline for molecular design","authors":"Ana Laura Dias, Tiago Rodrigues","doi":"10.1038/s42256-025-01102-x","DOIUrl":"10.1038/s42256-025-01102-x","url":null,"abstract":"The next major challenge for artificial intelligence in drug development lies in proving its value in real-world settings. A new technology not only supports the generation of novel chemical entities but also accelerates a range of real-world molecular design tasks.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1376-1377"},"PeriodicalIF":23.9,"publicationDate":"2025-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145009022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conditional generation of real antigen-specific T cell receptor sequences
Pub Date: 2025-09-08 | DOI: 10.1038/s42256-025-01096-6
Dhuvarakesh Karthikeyan, Sarah N. Bennett, Amy G. Reynolds, Benjamin G. Vincent, Alex Rubinsteyn
Despite recent advances in T cell receptor (TCR) engineering, designing functional TCRs against arbitrary targets remains challenging due to complex rules governing cross-reactivity and limited paired data. Here we present TCR-TRANSLATE, a sequence-to-sequence framework that adapts low-resource machine translation techniques to generate antigen-specific TCR sequences against unseen epitopes. By evaluating 12 model variants of the BART and T5 model architectures, we identified key factors affecting performance and utility, revealing discordances between these objectives. Our flagship model, TCRT5, outperforms existing approaches on computational benchmarks, prioritizing functionally relevant sequences at higher ranks. Most significantly, we experimentally validated a computationally designed TCR against Wilms’ tumour antigen, a therapeutically relevant target in leukaemia, excluded from our training and validation sets. Although the identified TCR shows cross-reactivity with pathogen-derived peptides, highlighting limitations in specificity, our work represents the successful computational design of a functional TCR construct against a non-viral epitope from the target sequence alone. Our findings establish a foundation for computational TCR design and reveal current limitations in data availability and methodology, providing a framework for accelerating personalized immunotherapy by reducing the search space for novel targets. TCR-TRANSLATE, a deep learning framework adapting machine translation to immune design, demonstrates the successful generation of a functional T cell receptor sequence for a cancer epitope from the target sequence alone.
{"title":"Conditional generation of real antigen-specific T cell receptor sequences","authors":"Dhuvarakesh Karthikeyan, Sarah N. Bennett, Amy G. Reynolds, Benjamin G. Vincent, Alex Rubinsteyn","doi":"10.1038/s42256-025-01096-6","DOIUrl":"10.1038/s42256-025-01096-6","url":null,"abstract":"Despite recent advances in T cell receptor (TCR) engineering, designing functional TCRs against arbitrary targets remains challenging due to complex rules governing cross-reactivity and limited paired data. Here we present TCR-TRANSLATE, a sequence-to-sequence framework that adapts low-resource machine translation techniques to generate antigen-specific TCR sequences against unseen epitopes. By evaluating 12 model variants of the BART and T5 model architectures, we identified key factors affecting performance and utility, revealing discordances between these objectives. Our flagship model, TCRT5, outperforms existing approaches on computational benchmarks, prioritizing functionally relevant sequences at higher ranks. Most significantly, we experimentally validated a computationally designed TCR against Wilms’ tumour antigen, a therapeutically relevant target in leukaemia, excluded from our training and validation sets. Although the identified TCR shows cross-reactivity with pathogen-derived peptides, highlighting limitations in specificity, our work represents the successful computational design of a functional TCR construct against a non-viral epitope from the target sequence alone. Our findings establish a foundation for computational TCR design and reveal current limitations in data availability and methodology, providing a framework for accelerating personalized immunotherapy by reducing the search space for novel targets. TCR-TRANSLATE, a deep learning framework adapting machine translation to immune design, demonstrates the successful generation of a functional T cell receptor sequence for a cancer epitope from the target sequence alone.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1494-1509"},"PeriodicalIF":23.9,"publicationDate":"2025-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.comhttps://www.nature.com/articles/s42256-025-01096-6.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145009024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards compute-efficient Byzantine-robust federated learning with fully homomorphic encryption
Pub Date: 2025-09-08 | DOI: 10.1038/s42256-025-01107-6
Siyang Jiang, Hao Yang, Qipeng Xie, Chuan Ma, Sen Wang, Zhe Liu, Tao Xiang, Guoliang Xing
In highly regulated domains such as finance and healthcare, where stringent data-sharing constraints pose substantial obstacles, federated learning (FL) has emerged as a transformative paradigm in distributed machine learning, facilitating collaborative model training, preserving data decentralization and upholding governance standards. Despite its advantages, FL is vulnerable to poisoning attacks during central model aggregation, prompting the development of Byzantine-robust FL systems that use robust aggregation rules to counter malicious attacks. However, neural network models in such systems are susceptible to unintentionally memorizing and revealing individual training instances, thereby introducing substantial information leakage risks, as adversaries may exploit this vulnerability to reconstruct sensitive data through model outputs transmitted over the air. Existing solutions fall short of providing a viable Byzantine-robust FL system that is completely secure against information leakage and is computationally efficient. To address these concerns, we propose Lancelot, an efficient and effective Byzantine-robust FL framework that uses fully homomorphic encryption to safeguard against malicious client activities. Lancelot introduces a mask-based encrypted sorting mechanism that overcomes the limitations of multiplication depth in ciphertext sorting with zero information leakage. It incorporates cryptographic enhancements like lazy relinearization, dynamic hoisting and GPU acceleration to ensure practical computational efficiency. Extensive experiments demonstrate that Lancelot surpasses existing approaches, achieving a 20-fold enhancement in processing speed. Lancelot, a compute-efficient federated learning framework using homomorphic encryption to prevent information leakage, is presented, achieving 20 times faster processing speeds through advanced cryptographic and encrypted sorting techniques.
{"title":"Towards compute-efficient Byzantine-robust federated learning with fully homomorphic encryption","authors":"Siyang Jiang, Hao Yang, Qipeng Xie, Chuan Ma, Sen Wang, Zhe Liu, Tao Xiang, Guoliang Xing","doi":"10.1038/s42256-025-01107-6","DOIUrl":"10.1038/s42256-025-01107-6","url":null,"abstract":"In highly regulated domains such as finance and healthcare, where stringent data-sharing constraints pose substantial obstacles, federated learning (FL) has emerged as a transformative paradigm in distributed machine learning, facilitating collaborative model training, preserving data decentralization and upholding governance standards. Despite its advantages, FL is vulnerable to poisoning attacks during central model aggregation, prompting the development of Byzantine-robust FL systems that use robust aggregation rules to counter malicious attacks. However, neural network models in such systems are susceptible to unintentionally memorizing and revealing individual training instances, thereby introducing substantial information leakage risks, as adversaries may exploit this vulnerability to reconstruct sensitive data through model outputs transmitted over the air. Existing solutions fall short of providing a viable Byzantine-robust FL system that is completely secure against information leakage and is computationally efficient. To address these concerns, we propose Lancelot, an efficient and effective Byzantine-robust FL framework that uses fully homomorphic encryption to safeguard against malicious client activities. Lancelot introduces a mask-based encrypted sorting mechanism that overcomes the limitations of multiplication depth in ciphertext sorting with zero information leakage. It incorporates cryptographic enhancements like lazy relinearization, dynamic hoisting and GPU acceleration to ensure practical computational efficiency. Extensive experiments demonstrate that Lancelot surpasses existing approaches, achieving a 20-fold enhancement in processing speed. Lancelot, a compute-efficient federated learning framework using homomorphic encryption to prevent information leakage, is presented, achieving 20 times faster processing speeds through advanced cryptographic and encrypted sorting techniques.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 10","pages":"1657-1668"},"PeriodicalIF":23.9,"publicationDate":"2025-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145009023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}