Pub Date: 2025-11-06 | DOI: 10.1038/s42256-025-01130-7
Yulin Wang, Yang Yue, Yang Yue, Huanqian Wang, Haojun Jiang, Yizeng Han, Zanlin Ni, Yifan Pu, Minglei Shi, Rui Lu, Qisen Yang, Andrew Zhao, Zhuofan Xia, Shiji Song, Gao Huang
Human vision is highly adaptive, efficiently sampling intricate environments by sequentially fixating on task-relevant regions. In contrast, prevailing machine vision models passively process entire scenes at once, incurring excessive resource demands that scale with spatial–temporal input resolution and model size, a critical limitation impeding both future advances and real-world application. Here we introduce AdaptiveNN, a general framework aiming to enable the transition from ‘passive’ to ‘active and adaptive’ vision models. AdaptiveNN formulates visual perception as a coarse-to-fine sequential decision-making process, progressively identifying and attending to regions pertinent to the task, incrementally combining information across fixations and actively concluding observation once sufficient information has been gathered. We establish a theory integrating representation learning with self-rewarding reinforcement learning, enabling end-to-end training of the non-differentiable AdaptiveNN without additional supervision on fixation locations. We assess AdaptiveNN on 17 benchmarks spanning 9 tasks, including large-scale visual recognition, fine-grained discrimination, visual search, processing images from real driving and medical scenarios, language-driven embodied artificial intelligence and side-by-side comparisons with humans. AdaptiveNN achieves up to a 28-fold reduction in inference cost without sacrificing accuracy, flexibly adapts to varying task demands and resource budgets without retraining, and provides enhanced interpretability via its fixation patterns, demonstrating a promising avenue towards efficient, flexible and interpretable computer vision. Furthermore, AdaptiveNN exhibits closely human-like perceptual behaviours in many cases, revealing its potential as a valuable tool for investigating visual cognition.
A deep learning approach, AdaptiveNN, shifts machine vision models from passive to active to mimic human-like perception.
The method achieves inference costs up to 28 times lower without accuracy loss, while showcasing online-adaptable and interpretable behaviours.
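The fixate–accumulate–stop loop described above can be sketched schematically. This is a minimal illustration under stated assumptions, not the authors' implementation: `glimpse`, `policy` and `classify` are hypothetical stand-ins for the fixation cropper, the learned fixation policy and the per-glimpse feature extractor, and the softmax-confidence stopping rule is just one plausible way to "actively conclude observation when sufficient".

```python
import numpy as np

def glimpse(image, center, size=8):
    """Crop a square patch around `center`, clipped to the image bounds."""
    r, c = center
    half = size // 2
    r0, c0 = max(r - half, 0), max(c - half, 0)
    return image[r0:r0 + size, c0:c0 + size]

def perceive(image, policy, classify, max_fixations=5, conf_thresh=0.9):
    """Coarse-to-fine sequential perception: fixate, fuse evidence, stop early.

    `classify(patch)` returns a class-evidence vector for one glimpse;
    `policy(state, shape)` picks the next fixation from the running state.
    """
    state = np.zeros(10)                                  # running evidence over 10 classes
    center = (image.shape[0] // 2, image.shape[1] // 2)   # coarse central first look
    for t in range(max_fixations):
        patch = glimpse(image, center)
        state = state + classify(patch)                   # incremental evidence fusion
        probs = np.exp(state) / np.exp(state).sum()       # softmax over accumulated evidence
        if probs.max() >= conf_thresh:                    # confident enough: stop observing
            return probs.argmax(), t + 1
        center = policy(state, image.shape)               # choose next task-relevant fixation
    return probs.argmax(), max_fixations
```

The early-exit check is what yields input-adaptive inference cost: easy inputs terminate after one or two glimpses, while ambiguous ones consume the full fixation budget.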
Title: Emulating human-like adaptive vision for efficient and flexible machine visual perception. Nature Machine Intelligence 7(11), 1804–1822.
Pub Date: 2025-11-06 | DOI: 10.1038/s42256-025-01137-0
Chaojun Xiao, Jie Cai, Weilin Zhao, Biyuan Lin, Guoyang Zeng, Jie Zhou, Zhi Zheng, Xu Han, Zhiyuan Liu, Maosong Sun
Large language models (LLMs) have emerged as a milestone in artificial intelligence. The scaling law indicates that LLM performance can continually improve as model size increases, but ever-larger models pose challenges for training and deployment. Despite numerous efforts to improve LLM efficiency, there is no general consensus on development trends or evaluation metrics for the efficiency of LLMs at different scales. To address this tension between model performance and efficiency, we introduce capability density as a metric to evaluate the quality of LLMs and to describe their trend in terms of both effectiveness and efficiency. Intuitively, capability density can be understood as the capability contained within each unit of model parameters; it provides a unified framework for assessing both model performance and efficiency. Here we show an empirical observation, called the ‘densing law’, that the capability density of LLMs grows exponentially over time. More specifically, on widely used evaluation benchmarks, the maximum capability density of open-source LLMs doubles approximately every 3.5 months. This implies that both the parameter requirements and the inference costs for achieving equivalent performance decrease exponentially over time, offering insights for efficient LLM development strategies. Xiao et al. introduce ‘capability density’, defined as capability per parameter, as a metric for evaluating large language models. They report an empirical trend, the ‘densing law’, which states that capability density doubles approximately every 3.5 months, indicating that equivalent model performance can be achieved with exponentially fewer parameters over time.
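The doubling claim implies simple arithmetic for parameter requirements over time. A minimal sketch, assuming the reported ~3.5-month doubling period holds and taking "capability density" as capability per parameter (the function name and example figures below are illustrative, not from the paper):

```python
def equivalent_params(params_now: float, months_elapsed: float,
                      doubling_months: float = 3.5) -> float:
    """Parameters needed to match today's capability after `months_elapsed`,
    under the densing law: capability density doubles every `doubling_months`,
    so the parameter count for equivalent performance halves on the same schedule."""
    return params_now / 2 ** (months_elapsed / doubling_months)

# After 7 months (two doublings), matching a hypothetical 70B-parameter
# baseline would take roughly 70 / 4 = 17.5B parameters.
```

The same exponent governs the claimed exponential decay of inference cost, to the extent that cost scales with parameter count.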
Title: Densing law of LLMs. Nature Machine Intelligence 7(11), 1823–1833. Open-access PDF: https://www.nature.com/articles/s42256-025-01137-0.pdf
Pub Date: 2025-11-03 | DOI: 10.1038/s42256-025-01113-8
Mirac Suzgun, Tayfun Gur, Federico Bianchi, Daniel E. Ho, Thomas Icard, Dan Jurafsky, James Zou
As language models (LMs) are increasingly deployed in high-stakes domains such as law, medicine, journalism and science, their ability to distinguish belief from knowledge, and fact from fiction, becomes imperative. Failure to make such distinctions can mislead diagnoses, distort judicial judgments and amplify misinformation. Here we evaluate 24 cutting-edge LMs using KaBLE, a new benchmark of 13,000 questions across 13 epistemic tasks. Our findings reveal crucial limitations. In particular, all models tested systematically fail to acknowledge first-person false beliefs, with GPT-4o dropping from 98.2% to 64.4% accuracy and DeepSeek R1 plummeting from over 90% to 14.4%. Further, models process third-person false beliefs with substantially higher accuracy (95% for newer models; 79% for older ones) than first-person false beliefs (62.6% for newer; 52.5% for older), revealing a troubling attribution bias. We also find that, while recent models show competence in recursive knowledge tasks, they still rely on inconsistent reasoning strategies, suggesting superficial pattern matching rather than robust epistemic understanding. Most models lack a robust grasp of the factive nature of knowledge: that knowledge inherently requires truth. These limitations necessitate urgent improvements before deploying LMs in high-stakes domains where epistemic distinctions are crucial. Suzgun et al. find that current large language models cannot reliably distinguish between belief, knowledge and fact, raising concerns for their use in healthcare, law and journalism, where such distinctions are critical.
Title: Language models cannot reliably distinguish belief from knowledge and fact. Nature Machine Intelligence 7(11), 1780–1790.
Pub Date: 2025-10-24 | DOI: 10.1038/s42256-025-01129-0
Ahmed Y. Ismail, Bradley A. A. Martin, Keith T. Butler
Molecular dynamics (MD) simulations are widely used for understanding atomic motion but require substantial computational time. In new research by Nam et al., a generative artificial intelligence framework is developed to accelerate the MD simulations for crystalline materials, by reframing the task as conditional generation of atomic displacement.
Title: Accelerating molecular dynamics by going with the flow. Nature Machine Intelligence 7(10), 1598–1599.
Pub Date: 2025-10-23 | DOI: 10.1038/s42256-025-01136-1
Huanyu Tao, Xiaoyu Wang, Sheng-You Huang
Accurate prediction of protein–peptide interactions is critical for peptide drug discovery. However, owing to the limited number of protein–peptide structures in the Protein Data Bank, it is challenging to train an accurate scoring function for protein–peptide interactions. Here, to address this challenge, we propose GraphPep, an interaction-derived graph neural network model for scoring protein–peptide complexes. GraphPep models protein–peptide interactions, rather than the traditional atoms or residues, as graph nodes, and its loss function focuses on residue–residue contacts instead of a single peptide root mean square deviation. GraphPep can therefore not only efficiently capture the most important protein–peptide interactions, but also mitigate the problem of limited training data. Moreover, its power is further enhanced by the ESM-2 protein language model. GraphPep is extensively evaluated on diverse decoy sets generated by various protein–peptide docking programs and AlphaFold, and compared against state-of-the-art methods; the results demonstrate its accuracy and robustness. GraphPep presents an interaction-derived and protein language model-powered graph learning framework for robust scoring of protein–peptide complexes, substantially enhancing the binding mode prediction of protein–peptide docking.
Title: An interaction-derived graph learning framework for scoring protein–peptide complexes. Nature Machine Intelligence 7(11), 1858–1869.
Pub Date: 2025-10-21 | DOI: 10.1038/s42256-025-01116-5
Xianyuan Liu, Jiayang Zhang, Shuo Zhou, Thijs L. van der Plas, Avish Vijayaraghavan, Anastasiia Grishina, Mengdie Zhuang, Daniel Schofield, Christopher Tomlinson, Yuhan Wang, Ruizhe Li, Louisa van Zeeland, Sina Tabakhi, Cyndie Demeocq, Xiang Li, Arunav Das, Orlando Timmerman, Thomas Baldwin-McDonald, Jinge Wu, Peizhen Bai, Zahraa Al Sahili, Omnia Alwazzan, Thao N. Do, Mohammod N. I. Suvon, Angeline Wang, Lucia Cipolina-Kun, Luigi A. Moretti, Lucas Farndale, Nitisha Jain, Natalia Efremova, Yan Ge, Marta Varela, Hak-Keung Lam, Oya Celiktutan, Ben R. Evans, Alejandro Coca-Castro, Honghan Wu, Zahraa S. Abdallah, Chen Chen, Valentin Danchev, Nataliya Tkachenko, Lei Lu, Tingting Zhu, Gregory G. Slabaugh, Roger K. Moore, William K. Cheung, Peter H. Charlton, Haiping Lu
Multimodal artificial intelligence (AI) integrates diverse types of data via machine learning to improve understanding, prediction and decision-making across disciplines such as healthcare, science and engineering. However, most multimodal AI advances focus on models for vision and language data, and their deployability remains a key challenge. We advocate a deployment-centric workflow that incorporates deployment constraints early on to reduce the likelihood of undeployable solutions, complementing data-centric and model-centric approaches. We also emphasize deeper integration across multiple levels of multimodality through stakeholder engagement and interdisciplinary collaboration to broaden the research scope beyond vision and language. To facilitate this approach, we identify common multimodal-AI-specific challenges shared across disciplines and examine three real-world use cases: pandemic response, self-driving car design and climate change adaptation, drawing expertise from healthcare, social science, engineering, science, sustainability and finance. By fostering interdisciplinary dialogue and open research practices, our community can accelerate deployment-centric development for broad societal impact. Multimodal AI combines different types of data to improve decision-making in fields such as healthcare and engineering, but work so far has focused on vision and language models. To make these systems more usable in the real world, Liu et al. discuss the need to develop approaches with deployment in mind from the start, working closely with experts across relevant disciplines.
Title: Towards deployment-centric multimodal AI beyond vision and language. Nature Machine Intelligence 7(10), 1612–1624.
Pub Date: 2025-10-21 | DOI: 10.1038/s42256-025-01111-w
Jerome J. Choi, Noah Cohen Kalafut, Tim Gruenloh, Corinne D. Engelman, Tianyuan Lu, Daifeng Wang
Single-omics approaches often provide a limited perspective on complex biological systems, whereas multi-omics integration enables a more comprehensive understanding by combining diverse data views. However, integrating heterogeneous data types and interpreting complex relationships between biological features—both within and across views—remains a major challenge. Here, to address these challenges, we introduce COSIME (Cooperative Multi-view Integration with a Scalable and Interpretable Model Explainer). COSIME applies the backpropagation of a learnable optimal transport algorithm to deep neural networks, thus enabling the learning of latent features from several views to predict disease phenotypes. It also incorporates Monte Carlo sampling to enable interpretable assessments of both feature importance and pairwise feature interactions for both within and across views. We applied COSIME to both simulated and real-world datasets—including single-cell transcriptomics, spatial transcriptomics, epigenomics and metabolomics—to predict Alzheimer’s disease-related phenotypes. Benchmarking of existing methods demonstrated that COSIME improves prediction accuracy and provides interpretability. For example, it reveals that synergistic interactions between astrocyte and microglia genes associated with Alzheimer’s disease are more likely to localize at the edges of the middle temporal gyrus. Finally, COSIME is also publicly available as an open-source tool. Choi et al. introduce a machine learning model that integrates diverse multi-view data to predict disease phenotypes. The model includes an interpretable explainer that identifies interacting biological features, such as synergistic genes in astrocytes and microglia associated with Alzheimer’s disease.
Title: Cooperative multi-view integration with a scalable and interpretable model explainer. Nature Machine Intelligence 7(10), 1636–1656.
Pub Date: 2025-10-21 | DOI: 10.1038/s42256-025-01124-5
David Graber, Peter Stockinger, Fabian Meyer, Siddhartha Mishra, Claus Horn, Rebecca Buller
The field of computational drug design requires accurate scoring functions to predict binding affinities for protein–ligand interactions. However, train–test data leakage between the PDBbind database and the Comparative Assessment of Scoring Function benchmark datasets has severely inflated the performance metrics of currently available deep-learning-based binding affinity prediction models, leading to overestimation of their generalization capabilities. Here we address this issue by proposing PDBbind CleanSplit, a training dataset curated by a new structure-based filtering algorithm that eliminates train–test data leakage as well as redundancies within the training set. Retraining current top-performing models on CleanSplit caused their benchmark performance to drop substantially, indicating that the performance of existing models is largely driven by data leakage. By contrast, our graph neural network model maintains high benchmark performance when trained on CleanSplit. Leveraging a sparse graph modelling of protein–ligand interactions and transfer learning from language models, our model is able to generalize to strictly independent test datasets. Graber et al. characterize biases and data leakage in protein–ligand datasets and show that a cleanly filtered training–test split leads to improved generalization in binding affinity prediction tasks.
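The leakage-removal step described above can be illustrated generically. This is a sketch, not the paper's actual CleanSplit algorithm: `filter_train_set` and its `similarity` argument are hypothetical names, the similarity measure stands in for whatever structure-based comparison the filter uses, and the 0.9 cutoff is purely illustrative.

```python
def filter_train_set(train_items, test_items, similarity, threshold=0.9):
    """Drop training entries that are too similar to ANY test entry,
    so benchmark scores cannot be inflated by train-test leakage.

    `similarity(a, b)` should return a score in [0, 1]; a training item
    scoring >= `threshold` against any test item is removed."""
    return [
        item for item in train_items
        if all(similarity(item, t) < threshold for t in test_items)
    ]
```

A redundancy filter within the training set itself, which the abstract says CleanSplit also performs, could reuse the same predicate pairwise over the training items.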
"Resolving data bias improves generalization in binding affinity prediction", by David Graber, Peter Stockinger, Fabian Meyer, Siddhartha Mishra, Claus Horn, Rebecca Buller. Nature Machine Intelligence 7(10), 1713–1725.
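The paper's structure-based filtering is considerably more involved, but the core of leakage removal, dropping every training entry that is too similar to some test entry, can be sketched with a simple set-overlap filter. The Jaccard similarity measure, the fingerprint-as-set representation and the 0.7 threshold below are illustrative assumptions, not CleanSplit's actual criteria.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard (Tanimoto-style) similarity between two fingerprint sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def clean_split(train: list, test: list, threshold: float = 0.7) -> list:
    """Keep only training entries whose similarity to EVERY test entry
    stays below the threshold, removing train-test leakage.
    (CleanSplit additionally prunes redundancy within the training
    set itself; that step is omitted here for brevity.)"""
    return [
        t for t in train
        if max(jaccard(t, s) for s in test) < threshold
    ]

# toy fingerprints: the first training entry overlaps a test entry too much
train = [{"C", "N", "O"}, {"C", "S"}]
test = [{"C", "N", "O", "P"}]
filtered = clean_split(train, test)
print(filtered)  # → [{'C', 'S'}]: the leaked entry (similarity 0.75) is dropped
```

In the paper's setting the similarity computation is structure-based rather than a set overlap, but the shape of the procedure, max-similarity against the whole test set followed by a hard cutoff, is the same.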
Pub Date : 2025-10-21DOI: 10.1038/s42256-025-01139-y
Questions over whether neural networks learn universal or model-specific representations framed a community event at the Cognitive Computational Neuroscience conference in August 2025, highlighting future directions on a fundamental topic in NeuroAI.
"Are neural network representations universal or idiosyncratic?" Nature Machine Intelligence 7(10), 1589–1590.
Pub Date : 2025-10-21DOI: 10.1038/s42256-025-01119-2
Yan C. Leyva, Marcelo D. T. Torres, Carlos A. Oliva, Cesar de la Fuente-Nunez, Carlos A. Brizuela
Computational protein and peptide design is emerging as a transformative framework for engineering macromolecules with precise structures and functions, offering innovative solutions in medicine, biotechnology and materials science. However, current methods predominantly rely on generative models, which are expensive to train and modify. Here, we introduce the Key-Cutting Machine (KCM), an optimization-based platform that iteratively leverages structure prediction to match desired backbone geometries. KCM requires only a single graphics processing unit and enables seamless incorporation of user-defined requirements into the objective function, circumventing the high retraining costs typical of generative models while allowing straightforward assessment of measurable properties. By employing an estimation of distribution algorithm, KCM optimizes sequences on the basis of geometric, physicochemical and energetic criteria. We benchmarked its performance on α-helices, β-sheets, a combination of both and unstructured regions, demonstrating precise backbone geometry design. As a proof of concept, we applied KCM to antimicrobial peptide design by using a template antimicrobial peptide as the ‘key’, yielding a candidate with potent in vitro activity against multiple bacterial strains and efficacy in a murine infection model. KCM thus emerges as a robust tool for de novo protein and peptide design, offering a flexible paradigm for replicating and extending the structure–function relationships of existing templates. Powerful generative AI models for designing biological macromolecules are being developed, with applications in medicine, biotechnology and materials science, but these models are expensive to train and modify. Leyva et al. introduce the Key-Cutting Machine, an optimization-based platform for proteins and peptides that iteratively leverages structure prediction to match desired backbone geometries.
"Tailored structured peptide design with a key-cutting machine approach", by Yan C. Leyva, Marcelo D. T. Torres, Carlos A. Oliva, Cesar de la Fuente-Nunez, Carlos A. Brizuela. Nature Machine Intelligence 7(10), 1685–1697.
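The estimation of distribution algorithm at KCM's core can be illustrated in miniature: keep a per-position probability table over amino acids, sample candidate sequences from it, score them, and re-estimate the table from the top scorers. The toy objective below (similarity to a fixed template "key") is a stand-in for KCM's geometric, physicochemical and energetic criteria, which require a structure predictor; the population sizes, smoothing constant and target sequence are all hypothetical.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # 20 canonical amino acids

def eda_design(score, length, pop=60, elite=12, gens=50, seed=0):
    """Minimal univariate EDA: each position holds an independent
    distribution over amino acids, re-estimated from the elites."""
    rng = random.Random(seed)
    # start from a uniform distribution at every position
    probs = [{aa: 1.0 / len(ALPHABET) for aa in ALPHABET} for _ in range(length)]
    best, best_score = None, float("-inf")
    for _ in range(gens):
        cohort = [
            "".join(rng.choices(list(p), weights=p.values())[0] for p in probs)
            for _ in range(pop)
        ]
        cohort.sort(key=score, reverse=True)
        if score(cohort[0]) > best_score:
            best, best_score = cohort[0], score(cohort[0])
        elites = cohort[:elite]
        # re-estimate each position's distribution with light smoothing
        for i in range(length):
            counts = {aa: 0.1 for aa in ALPHABET}
            for seq in elites:
                counts[seq[i]] += 1.0
            total = sum(counts.values())
            probs[i] = {aa: c / total for aa, c in counts.items()}
    return best, best_score

# toy objective: per-position match against a hypothetical template 'key'
TARGET = "ACDKWS"
match = lambda seq: sum(a == b for a, b in zip(seq, TARGET))
designed, fitness = eda_design(match, length=len(TARGET))
```

In KCM the scoring call would invoke a structure predictor and compare the predicted backbone to the desired geometry, which is exactly where the method's flexibility comes from: swapping in a different objective needs no retraining, only a new `score` function.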