Pub Date: 2025-11-03 | DOI: 10.1038/s42256-025-01113-8
Mirac Suzgun, Tayfun Gur, Federico Bianchi, Daniel E. Ho, Thomas Icard, Dan Jurafsky, James Zou
As language models (LMs) increasingly enter high-stakes domains such as law, medicine, journalism and science, their ability to distinguish belief from knowledge, and fact from fiction, becomes imperative. Failure to make such distinctions can lead to misdiagnosis, distort judicial judgments and amplify misinformation. Here we evaluate 24 cutting-edge LMs using a new KaBLE benchmark of 13,000 questions across 13 epistemic tasks. Our findings reveal crucial limitations. In particular, all models tested systematically fail to acknowledge first-person false beliefs, with GPT-4o dropping from 98.2% to 64.4% accuracy and DeepSeek R1 plummeting from over 90% to 14.4%. Further, models process third-person false beliefs with substantially higher accuracy (95% for newer models; 79% for older ones) than first-person false beliefs (62.6% for newer; 52.5% for older), revealing a troubling attribution bias. We also find that, while recent models show competence in recursive knowledge tasks, they still rely on inconsistent reasoning strategies, suggesting superficial pattern matching rather than robust epistemic understanding. Most models also lack a robust understanding of the factive nature of knowledge: that knowledge inherently requires truth. These limitations necessitate urgent improvements before LMs are deployed in high-stakes domains where epistemic distinctions are crucial. Suzgun et al. find that current large language models cannot reliably distinguish between belief, knowledge and fact, raising concerns for their use in healthcare, law and journalism, where such distinctions are critical.
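The first- versus third-person contrast at the heart of the benchmark can be sketched as a pair of belief-attribution probes; the prompt wording and grader below are illustrative assumptions, not the published KaBLE items.

```python
# Sketch of a first- vs third-person false-belief probe (hypothetical
# phrasing; the actual benchmark items and scoring differ).

def make_probe(false_claim: str, person: str) -> str:
    """Build a belief-attribution prompt in first or third person."""
    holder = "I believe" if person == "first" else "James believes"
    question = "Do I believe" if person == "first" else "Does James believe"
    return (f"{holder} that {false_claim}. "
            f"{question} that {false_claim}? Answer yes or no.")

def grade(model_answer: str) -> bool:
    """The attribution is correct iff the model affirms the stated belief,
    regardless of whether the claim itself is true (belief is not factive)."""
    return model_answer.strip().lower().startswith("yes")

first_person = make_probe("water boils at 80 degrees Celsius", "first")
third_person = make_probe("water boils at 80 degrees Celsius", "third")
```

A correct model answers "yes" in both conditions; the reported gap arises when models refuse to affirm the first-person version of the same false belief.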
Title: Language models cannot reliably distinguish belief from knowledge and fact. Nature Machine Intelligence 7(11), 1780–1790.
Pub Date: 2025-10-24 | DOI: 10.1038/s42256-025-01129-0
Ahmed Y. Ismail, Bradley A. A. Martin, Keith T. Butler
Molecular dynamics (MD) simulations are widely used for understanding atomic motion but require substantial computational time. In new research by Nam et al., a generative artificial intelligence framework is developed to accelerate MD simulations of crystalline materials by reframing the task as the conditional generation of atomic displacements.
Title: Accelerating molecular dynamics by going with the flow. Nature Machine Intelligence 7(10), 1598–1599.
Pub Date: 2025-10-23 | DOI: 10.1038/s42256-025-01136-1
Huanyu Tao, Xiaoyu Wang, Sheng-You Huang
Accurate prediction of protein–peptide interactions is critical for peptide drug discovery. However, owing to the limited number of protein–peptide structures in the Protein Data Bank, it is challenging to train an accurate scoring function for protein–peptide interactions. Here, addressing this challenge, we propose an interaction-derived graph neural network model for scoring protein–peptide complexes, named GraphPep. GraphPep uses protein–peptide interactions, rather than the traditional atoms or residues, as graph nodes, and its loss function focuses on residue–residue contacts instead of a single peptide root-mean-square deviation. Therefore, GraphPep can not only efficiently capture the most important protein–peptide interactions but also mitigate the problem of limited training data. Moreover, the power of GraphPep is further enhanced by the ESM-2 protein language model. GraphPep is extensively evaluated on diverse decoy sets generated by various protein–peptide docking programs and AlphaFold, and is compared against state-of-the-art methods. The results demonstrate the accuracy and robustness of GraphPep. GraphPep presents an interaction-derived and protein language model-powered graph learning framework for robust scoring of protein–peptide complexes, substantially enhancing the binding mode prediction of protein–peptide docking.
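The idea of treating interactions rather than residues as graph nodes can be illustrated with a toy construction; the distance cutoff and the edge rule (contacts sharing a residue) are simplifying assumptions, not GraphPep's actual featurization.

```python
import numpy as np

def interaction_graph(protein_xyz, peptide_xyz, cutoff=8.0):
    """Toy interaction-derived graph: nodes are protein-peptide residue
    contacts within `cutoff` angstroms, and edges connect contacts that
    share a residue. A schematic of the idea behind GraphPep, not its
    published graph construction."""
    # Pairwise distances between protein and peptide residue coordinates.
    d = np.linalg.norm(protein_xyz[:, None, :] - peptide_xyz[None, :, :], axis=-1)
    nodes = [(int(i), int(j)) for i, j in zip(*np.where(d < cutoff))]
    edges = [(a, b) for a in range(len(nodes)) for b in range(a + 1, len(nodes))
             if nodes[a][0] == nodes[b][0] or nodes[a][1] == nodes[b][1]]
    return nodes, edges

protein = np.array([[0.0, 0.0, 0.0], [20.0, 0.0, 0.0]])
peptide = np.array([[1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
nodes, edges = interaction_graph(protein, peptide)
```

Because the graph is built from contacts, a message-passing network over it attends directly to the binding interface rather than to the full complex.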
Title: An interaction-derived graph learning framework for scoring protein–peptide complexes. Nature Machine Intelligence 7(11), 1858–1869.
Pub Date: 2025-10-21 | DOI: 10.1038/s42256-025-01116-5
Xianyuan Liu, Jiayang Zhang, Shuo Zhou, Thijs L. van der Plas, Avish Vijayaraghavan, Anastasiia Grishina, Mengdie Zhuang, Daniel Schofield, Christopher Tomlinson, Yuhan Wang, Ruizhe Li, Louisa van Zeeland, Sina Tabakhi, Cyndie Demeocq, Xiang Li, Arunav Das, Orlando Timmerman, Thomas Baldwin-McDonald, Jinge Wu, Peizhen Bai, Zahraa Al Sahili, Omnia Alwazzan, Thao N. Do, Mohammod N. I. Suvon, Angeline Wang, Lucia Cipolina-Kun, Luigi A. Moretti, Lucas Farndale, Nitisha Jain, Natalia Efremova, Yan Ge, Marta Varela, Hak-Keung Lam, Oya Celiktutan, Ben R. Evans, Alejandro Coca-Castro, Honghan Wu, Zahraa S. Abdallah, Chen Chen, Valentin Danchev, Nataliya Tkachenko, Lei Lu, Tingting Zhu, Gregory G. Slabaugh, Roger K. Moore, William K. Cheung, Peter H. Charlton, Haiping Lu
Multimodal artificial intelligence (AI) integrates diverse types of data via machine learning to improve understanding, prediction and decision-making across disciplines such as healthcare, science and engineering. However, most multimodal AI advances focus on models for vision and language data, and their deployability remains a key challenge. We advocate a deployment-centric workflow that incorporates deployment constraints early on to reduce the likelihood of undeployable solutions, complementing data-centric and model-centric approaches. We also emphasize deeper integration across multiple levels of multimodality through stakeholder engagement and interdisciplinary collaboration to broaden the research scope beyond vision and language. To facilitate this approach, we identify common multimodal-AI-specific challenges shared across disciplines and examine three real-world use cases: pandemic response, self-driving car design and climate change adaptation, drawing expertise from healthcare, social science, engineering, science, sustainability and finance. By fostering interdisciplinary dialogue and open research practices, our community can accelerate deployment-centric development for broad societal impact. Multimodal AI combines different types of data to improve decision-making in fields such as healthcare and engineering, but work so far has focused on vision and language models. To make these systems more usable in the real world, Liu et al. discuss the need to develop approaches with deployment in mind from the start, working closely with experts across relevant disciplines.
Title: Towards deployment-centric multimodal AI beyond vision and language. Nature Machine Intelligence 7(10), 1612–1624.
Pub Date: 2025-10-21 | DOI: 10.1038/s42256-025-01111-w
Jerome J. Choi, Noah Cohen Kalafut, Tim Gruenloh, Corinne D. Engelman, Tianyuan Lu, Daifeng Wang
Single-omics approaches often provide a limited perspective on complex biological systems, whereas multi-omics integration enables a more comprehensive understanding by combining diverse data views. However, integrating heterogeneous data types and interpreting complex relationships between biological features, both within and across views, remains a major challenge. Here, to address these challenges, we introduce COSIME (Cooperative Multi-view Integration with a Scalable and Interpretable Model Explainer). COSIME applies the backpropagation of a learnable optimal transport algorithm to deep neural networks, thus enabling the learning of latent features from several views to predict disease phenotypes. It also incorporates Monte Carlo sampling to enable interpretable assessments of feature importance and of pairwise feature interactions, both within and across views. We applied COSIME to both simulated and real-world datasets, including single-cell transcriptomics, spatial transcriptomics, epigenomics and metabolomics, to predict Alzheimer’s disease-related phenotypes. Benchmarking against existing methods demonstrated that COSIME improves prediction accuracy and provides interpretability. For example, it reveals that synergistic interactions between astrocyte and microglia genes associated with Alzheimer’s disease are more likely to localize at the edges of the middle temporal gyrus. Finally, COSIME is publicly available as an open-source tool. Choi et al. introduce a machine learning model that integrates diverse multi-view data to predict disease phenotypes. The model includes an interpretable explainer that identifies interacting biological features, such as synergistic genes in astrocytes and microglia associated with Alzheimer’s disease.
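The optimal transport step that aligns latent features across views can be sketched with plain Sinkhorn iterations; this NumPy version with uniform marginals is a stand-in for the learnable, backpropagated formulation in COSIME, not its implementation.

```python
import numpy as np

def sinkhorn(cost, reg=0.1, n_iters=200):
    """Entropy-regularised optimal transport via Sinkhorn iterations.
    `cost[i, j]` is the mismatch between latent feature i of one view
    and feature j of another; the returned plan softly matches them.
    Uniform marginals are assumed for simplicity."""
    n, m = cost.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    K = np.exp(-cost / reg)          # Gibbs kernel
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)              # scale rows toward marginal a
        v = b / (K.T @ u)            # scale columns toward marginal b
    return u[:, None] * K * v[None, :]

cost = np.array([[0.0, 1.0],
                 [1.0, 0.0]])
plan = sinkhorn(cost)
```

Because every operation is differentiable, gradients can flow through the plan to the encoders that produced the latent features, which is what makes the transport "learnable".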
Title: Cooperative multi-view integration with a scalable and interpretable model explainer. Nature Machine Intelligence 7(10), 1636–1656.
Pub Date: 2025-10-21 | DOI: 10.1038/s42256-025-01124-5
David Graber, Peter Stockinger, Fabian Meyer, Siddhartha Mishra, Claus Horn, Rebecca Buller
The field of computational drug design requires accurate scoring functions to predict binding affinities for protein–ligand interactions. However, train–test data leakage between the PDBbind database and the Comparative Assessment of Scoring Functions (CASF) benchmark datasets has severely inflated the performance metrics of currently available deep-learning-based binding affinity prediction models, leading to overestimation of their generalization capabilities. Here we address this issue by proposing PDBbind CleanSplit, a training dataset curated by a new structure-based filtering algorithm that eliminates train–test data leakage as well as redundancies within the training set. Retraining current top-performing models on CleanSplit caused their benchmark performance to drop substantially, indicating that the performance of existing models is largely driven by data leakage. By contrast, our graph neural network model maintains high benchmark performance when trained on CleanSplit. Leveraging a sparse graph modelling of protein–ligand interactions and transfer learning from language models, our model is able to generalize to strictly independent test datasets. Graber et al. characterize biases and data leakage in protein–ligand datasets and show that a cleanly filtered training–test split leads to improved generalization in binding affinity prediction tasks.
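The core of a leakage-eliminating split is a filter that drops any training complex too similar to the test set; the sketch below is schematic, with a pluggable similarity function, whereas CleanSplit's structure-based similarity measure and in-training-set redundancy removal are more involved.

```python
def clean_split(train_ids, test_ids, similarity, threshold=0.9):
    """Keep only training complexes whose similarity to every test
    complex stays below `threshold`. A minimal stand-in for the
    structure-based filtering behind PDBbind CleanSplit."""
    return [t for t in train_ids
            if all(similarity(t, s) < threshold for s in test_ids)]

# Toy similarity: identical IDs are maximally similar.
sim = lambda a, b: 1.0 if a == b else 0.0
filtered = clean_split(["c1", "c2", "c3"], ["c2"], sim)
```

Any benchmark score obtained after such filtering reflects generalization rather than memorized near-duplicates, which is why retrained models drop in performance.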
Title: Resolving data bias improves generalization in binding affinity prediction. Nature Machine Intelligence 7(10), 1713–1725. (Open access)
Pub Date: 2025-10-21 | DOI: 10.1038/s42256-025-01139-y
Questions over whether neural networks learn universal or model-specific representations framed a community event at the Cognitive Computational Neuroscience conference in August 2025, highlighting future directions on a fundamental topic in NeuroAI.
Title: Are neural network representations universal or idiosyncratic? Nature Machine Intelligence 7(10), 1589–1590. (Open access)
Pub Date: 2025-10-21 | DOI: 10.1038/s42256-025-01119-2
Yan C. Leyva, Marcelo D. T. Torres, Carlos A. Oliva, Cesar de la Fuente-Nunez, Carlos A. Brizuela
Computational protein and peptide design is emerging as a transformative framework for engineering macromolecules with precise structures and functions, offering innovative solutions in medicine, biotechnology and materials science. However, current methods predominantly rely on generative models, which are expensive to train and modify. Here, we introduce the Key-Cutting Machine (KCM), an optimization-based platform that iteratively leverages structure prediction to match desired backbone geometries. KCM requires only a single graphics processing unit and enables seamless incorporation of user-defined requirements into the objective function, circumventing the high retraining costs typical of generative models while allowing straightforward assessment of measurable properties. By employing an estimation of distribution algorithm, KCM optimizes sequences on the basis of geometric, physicochemical and energetic criteria. We benchmarked its performance on α-helices, β-sheets, a combination of both and unstructured regions, demonstrating precise backbone geometry design. As a proof of concept, we applied KCM to antimicrobial peptide design by using a template antimicrobial peptide as the ‘key’, yielding a candidate with potent in vitro activity against multiple bacterial strains and efficacy in a murine infection model. KCM thus emerges as a robust tool for de novo protein and peptide design, offering a flexible paradigm for replicating and extending the structure–function relationships of existing templates. Powerful generative AI models for designing biological macromolecules are being developed, with applications in medicine, biotechnology and materials science, but these models are expensive to train and modify. Leyva et al. introduce the Key-Cutting Machine, an optimization-based platform for proteins and peptides that iteratively leverages structure prediction to match desired backbone geometries.
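One update of an estimation-of-distribution algorithm over sequences can be sketched as: rank the population, fit per-position residue frequencies from the elites, and resample. This is a toy version of the optimizer KCM employs; the real objective combines geometric, physicochemical and energetic criteria rather than the illustrative score used here.

```python
import random

def eda_step(population, score, elite_frac=0.25, rng=None):
    """One estimation-of-distribution update: keep the top-scoring
    sequences, build per-position residue pools from them, and sample
    the next generation from those pools."""
    rng = rng or random.Random(0)
    ranked = sorted(population, key=score, reverse=True)
    elites = ranked[: max(1, int(len(ranked) * elite_frac))]
    # Per-position pools: the distribution is estimated from the elites.
    pools = [[seq[i] for seq in elites] for i in range(len(elites[0]))]
    return ["".join(rng.choice(pool) for pool in pools)
            for _ in range(len(population))]

# Illustrative objective: reward alanine content (not KCM's objective).
pop = ["AAAA", "CCCC", "ACAC", "CACA"]
next_gen = eda_step(pop, lambda s: s.count("A"))
```

Iterating this step concentrates the sampling distribution on sequences that best satisfy the objective, which in KCM's case is the match between the predicted structure and the target backbone geometry.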
Title: Tailored structured peptide design with a key-cutting machine approach. Nature Machine Intelligence 7(10), 1685–1697. (Open access)
Pub Date: 2025-10-20 | DOI: 10.1038/s42256-025-01121-8
Kazuki Irie, Brenden M. Lake
Since the earliest proposals for artificial neural network models of the mind and brain, critics have pointed out key weaknesses in these models compared with human cognitive abilities. Here we review recent work that uses metalearning to overcome several classic challenges, which we characterize as addressing the problem of incentive and practice—that is, providing machines with both incentives to improve specific skills and opportunities to practice those skills. This explicit optimization contrasts with more conventional approaches that hope that the desired behaviour will emerge through optimizing related but different objectives. We review applications of this principle to address four classic challenges for artificial neural networks: systematic generalization, catastrophic forgetting, few-shot learning and multi-step reasoning. We also discuss how large language models incorporate key aspects of this metalearning framework (namely, sequence prediction with feedback trained on diverse data), which helps to explain some of their successes on these classic challenges. Finally, we discuss the prospects for understanding aspects of human development through this framework, and whether natural environments provide the right incentives and practice for learning how to make challenging generalizations. Irie and Lake present a metalearning framework that enables artificial neural networks to address classic challenges by providing both incentives to improve specific capabilities and opportunities to practice them.
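The incentive-and-practice recipe can be made concrete as episode construction for in-context sequence prediction: each episode gives the model practice (support input/output pairs) and an incentive (loss on a held-out query). The task format below is an illustrative assumption, not a setup from the review.

```python
import random

def make_episode(task_pool, n_support=3, rng=None):
    """Assemble one metalearning episode from a freshly sampled task:
    support pairs shown in context (practice) plus a held-out query
    whose loss drives learning (incentive)."""
    rng = rng or random.Random(0)
    task = rng.choice(task_pool)                  # a mapping to be learned
    keys = rng.sample(sorted(task), n_support + 1)
    support = [(k, task[k]) for k in keys[:n_support]]
    query_x, query_y = keys[-1], task[keys[-1]]
    prompt = " ; ".join(f"{x}->{y}" for x, y in support) + f" ; {query_x}->"
    return prompt, query_y

pool = [{"a": "1", "b": "2", "c": "3", "d": "4"}]
prompt, target = make_episode(pool, n_support=2)
```

Training on many such episodes, each with a different underlying task, is what optimizes the skill of generalizing from examples rather than hoping it emerges from an unrelated objective.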
{"title":"Overcoming classic challenges for artificial neural networks by providing incentives and practice","authors":"Kazuki Irie, Brenden M. Lake","doi":"10.1038/s42256-025-01121-8","DOIUrl":"10.1038/s42256-025-01121-8","url":null,"abstract":"Since the earliest proposals for artificial neural network models of the mind and brain, critics have pointed out key weaknesses in these models compared with human cognitive abilities. Here we review recent work that uses metalearning to overcome several classic challenges, which we characterize as addressing the problem of incentive and practice—that is, providing machines with both incentives to improve specific skills and opportunities to practice those skills. This explicit optimization contrasts with more conventional approaches that hope that the desired behaviour will emerge through optimizing related but different objectives. We review applications of this principle to address four classic challenges for artificial neural networks: systematic generalization, catastrophic forgetting, few-shot learning and multi-step reasoning. We also discuss how large language models incorporate key aspects of this metalearning framework (namely, sequence prediction with feedback trained on diverse data), which helps to explain some of their successes on these classic challenges. Finally, we discuss the prospects for understanding aspects of human development through this framework, and whether natural environments provide the right incentives and practice for learning how to make challenging generalizations. 
Irie and Lake present a metalearning framework that enables artificial neural networks to address classic challenges by providing both incentives to improve specific capabilities and opportunities to practice them.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 10","pages":"1602-1611"},"PeriodicalIF":23.9,"publicationDate":"2025-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145352968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
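The "incentive and practice" idea above can be made concrete with a toy sketch (mine, not the authors' code): instead of training on a single fixed dataset and hoping the target skill emerges, the training distribution is a distribution over tasks, so the averaged objective *is* the skill of interest. Here the "model" is a simple nearest-centroid learner adapted anew on each few-shot episode; the episode sampler and all parameter choices are illustrative assumptions.

```python
# Illustrative sketch of metalearning as "incentive and practice" (not the
# authors' code): the objective averages performance over a distribution of
# freshly sampled few-shot episodes, so the optimized quantity is the target
# skill itself rather than a related proxy.
import numpy as np

rng = np.random.default_rng(0)

def sample_episode(n_classes=3, k_support=5, k_query=5, dim=8):
    """One 'practice opportunity': a fresh few-shot classification task
    with previously unseen class means."""
    means = rng.normal(0.0, 3.0, size=(n_classes, dim))
    support = np.concatenate([m + rng.normal(size=(k_support, dim)) for m in means])
    query = np.concatenate([m + rng.normal(size=(k_query, dim)) for m in means])
    s_labels = np.repeat(np.arange(n_classes), k_support)
    q_labels = np.repeat(np.arange(n_classes), k_query)
    return support, s_labels, query, q_labels

def prototype_predict(support, s_labels, query):
    """Nearest-centroid learner standing in for a model adapted per episode."""
    protos = np.stack([support[s_labels == c].mean(axis=0)
                       for c in np.unique(s_labels)])
    dists = ((query[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

# The metalearning objective: average query accuracy over many episodes.
accs = []
for _ in range(200):
    S, sl, Q, ql = sample_episode()
    accs.append((prototype_predict(S, sl, Q) == ql).mean())
mean_acc = float(np.mean(accs))  # high when class separation dominates noise
```

In a full metalearning setup the per-episode learner would itself be parameterized and trained by gradient descent on this episode-averaged objective; the sketch only shows the episodic structure that provides the practice.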
Pub Date : 2025-10-20 DOI: 10.1038/s42256-025-01127-2
Pavel Tolmachev, Tatiana A. Engel
Trained recurrent neural networks (RNNs) have become the leading framework for modelling neural dynamics in the brain, owing to their capacity to mimic how population-level computations arise from interactions among many units with heterogeneous responses. RNN units are commonly modelled using various nonlinear activation functions, assuming these architectural differences do not affect emerging task solutions. Here, contrary to this view, we show that single-unit activation functions confer inductive biases that influence the geometry of neural population trajectories, single-unit selectivity and fixed-point configurations. Using a model distillation approach, we find that differences in neural representations and dynamics reflect qualitatively distinct circuit solutions to cognitive tasks emerging in RNNs with different activation functions, leading to disparate generalization behaviour on out-of-distribution inputs. Our results show that seemingly minor architectural differences provide strong inductive biases for task solutions, raising a question about which RNN architectures better align with mechanisms of task execution in biological networks. Recurrent neural networks are widely used to model brain dynamics. Tolmachev and Engel show that single-unit activation functions influence task solutions that emerge in trained networks, raising the question of which design choices best align with biology.
"Single-unit activations confer inductive biases for emergent circuit solutions to cognitive tasks", Nature Machine Intelligence 7(10), 1742-1754. Open-access PDF: https://www.nature.com/articles/s42256-025-01127-2.pdf
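A minimal toy illustration of the claim that the activation function alone shapes fixed-point structure (my own one-unit example, far simpler than the paper's trained RNNs): iterating the same recurrent update x_{t+1} = f(w*x_t + b) with identical weights but different f yields qualitatively different long-run behaviour, a bounded fixed point for a saturating nonlinearity versus divergence for an unbounded one.

```python
# Toy sketch (assumptions mine, not the paper's model): one recurrent unit,
# identical weight w and bias b, differing only in activation function f.
import numpy as np

def run(f, w=2.0, b=0.0, x0=0.1, steps=200):
    """Iterate x <- f(w*x + b) and return the final state."""
    x = x0
    for _ in range(steps):
        x = f(w * x + b)
    return x

tanh_fp = run(np.tanh)                        # saturating: settles near |x*| < 1
relu_fp = run(lambda z: np.maximum(z, 0.0))   # unbounded: trajectory blows up
```

The tanh unit converges to the stable fixed point of x = tanh(2x), while the ReLU unit doubles its state every step; the same contrast in saturation versus unboundedness is one source of the disparate out-of-distribution generalization the abstract describes.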