Nature Machine Intelligence: Latest Articles

Towards deployment-centric multimodal AI beyond vision and language
IF 23.9; Zone 1 (Computer Science); Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE); Pub Date: 2025-10-21; DOI: 10.1038/s42256-025-01116-5
Xianyuan Liu, Jiayang Zhang, Shuo Zhou, Thijs L. van der Plas, Avish Vijayaraghavan, Anastasiia Grishina, Mengdie Zhuang, Daniel Schofield, Christopher Tomlinson, Yuhan Wang, Ruizhe Li, Louisa van Zeeland, Sina Tabakhi, Cyndie Demeocq, Xiang Li, Arunav Das, Orlando Timmerman, Thomas Baldwin-McDonald, Jinge Wu, Peizhen Bai, Zahraa Al Sahili, Omnia Alwazzan, Thao N. Do, Mohammod N. I. Suvon, Angeline Wang, Lucia Cipolina-Kun, Luigi A. Moretti, Lucas Farndale, Nitisha Jain, Natalia Efremova, Yan Ge, Marta Varela, Hak-Keung Lam, Oya Celiktutan, Ben R. Evans, Alejandro Coca-Castro, Honghan Wu, Zahraa S. Abdallah, Chen Chen, Valentin Danchev, Nataliya Tkachenko, Lei Lu, Tingting Zhu, Gregory G. Slabaugh, Roger K. Moore, William K. Cheung, Peter H. Charlton, Haiping Lu
Multimodal artificial intelligence (AI) integrates diverse types of data via machine learning to improve understanding, prediction and decision-making across disciplines such as healthcare, science and engineering. However, most multimodal AI advances focus on models for vision and language data, and their deployability remains a key challenge. We advocate a deployment-centric workflow that incorporates deployment constraints early on to reduce the likelihood of undeployable solutions, complementing data-centric and model-centric approaches. We also emphasize deeper integration across multiple levels of multimodality through stakeholder engagement and interdisciplinary collaboration to broaden the research scope beyond vision and language. To facilitate this approach, we identify common multimodal-AI-specific challenges shared across disciplines and examine three real-world use cases: pandemic response, self-driving car design and climate change adaptation, drawing expertise from healthcare, social science, engineering, science, sustainability and finance. By fostering interdisciplinary dialogue and open research practices, our community can accelerate deployment-centric development for broad societal impact. Multimodal AI combines different types of data to improve decision-making in fields such as healthcare and engineering, but work so far has focused on vision and language models. To make these systems more usable in the real world, Liu et al. discuss the need to develop approaches with deployment in mind from the start, working closely with experts across relevant disciplines.
Nature Machine Intelligence, volume 7, issue 10, pages 1612-1624; Journal Article
Citations: 0
Cooperative multi-view integration with a scalable and interpretable model explainer
Pub Date: 2025-10-21; DOI: 10.1038/s42256-025-01111-w
Jerome J. Choi, Noah Cohen Kalafut, Tim Gruenloh, Corinne D. Engelman, Tianyuan Lu, Daifeng Wang
Single-omics approaches often provide a limited perspective on complex biological systems, whereas multi-omics integration enables a more comprehensive understanding by combining diverse data views. However, integrating heterogeneous data types and interpreting complex relationships between biological features—both within and across views—remains a major challenge. Here, to address these challenges, we introduce COSIME (Cooperative Multi-view Integration with a Scalable and Interpretable Model Explainer). COSIME applies the backpropagation of a learnable optimal transport algorithm to deep neural networks, thus enabling the learning of latent features from several views to predict disease phenotypes. It also incorporates Monte Carlo sampling to enable interpretable assessments of both feature importance and pairwise feature interactions for both within and across views. We applied COSIME to both simulated and real-world datasets—including single-cell transcriptomics, spatial transcriptomics, epigenomics and metabolomics—to predict Alzheimer’s disease-related phenotypes. Benchmarking of existing methods demonstrated that COSIME improves prediction accuracy and provides interpretability. For example, it reveals that synergistic interactions between astrocyte and microglia genes associated with Alzheimer’s disease are more likely to localize at the edges of the middle temporal gyrus. Finally, COSIME is also publicly available as an open-source tool. Choi et al. introduce a machine learning model that integrates diverse multi-view data to predict disease phenotypes. The model includes an interpretable explainer that identifies interacting biological features, such as synergistic genes in astrocytes and microglia associated with Alzheimer’s disease.
Nature Machine Intelligence, volume 7, issue 10, pages 1636-1656; Journal Article
Citations: 0
Resolving data bias improves generalization in binding affinity prediction
Pub Date: 2025-10-21; DOI: 10.1038/s42256-025-01124-5
David Graber, Peter Stockinger, Fabian Meyer, Siddhartha Mishra, Claus Horn, Rebecca Buller
The field of computational drug design requires accurate scoring functions to predict binding affinities for protein–ligand interactions. However, train–test data leakage between the PDBbind database and the Comparative Assessment of Scoring Function benchmark datasets has severely inflated the performance metrics of currently available deep-learning-based binding affinity prediction models, leading to overestimation of their generalization capabilities. Here we address this issue by proposing PDBbind CleanSplit, a training dataset curated by a new structure-based filtering algorithm that eliminates train–test data leakage as well as redundancies within the training set. Retraining current top-performing models on CleanSplit caused their benchmark performance to drop substantially, indicating that the performance of existing models is largely driven by data leakage. By contrast, our graph neural network model maintains high benchmark performance when trained on CleanSplit. Leveraging a sparse graph modelling of protein–ligand interactions and transfer learning from language models, our model is able to generalize to strictly independent test datasets. Graber et al. characterize biases and data leakage in protein–ligand datasets and show that a cleanly filtered training–test split leads to improved generalization in binding affinity prediction tasks.
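The abstract above describes removing train-test leakage and training-set redundancy by similarity filtering. The sketch below only illustrates that general idea and is not the CleanSplit algorithm: the Jaccard similarity over toy feature sets, both thresholds, and the names `jaccard` and `filter_training_set` are assumptions made for this example, whereas the paper's filter is structure-based.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def jaccard(a: frozenset, b: frozenset) -> float:
    """Toy similarity in [0, 1]: shared features divided by all features."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def filter_training_set(
    train: Sequence[T],
    test: Sequence[T],
    similarity: Callable[[T, T], float],
    leak_threshold: float = 0.8,        # assumed value, for illustration only
    redundancy_threshold: float = 0.9,  # assumed value, for illustration only
) -> List[T]:
    """Drop training items that resemble test items, then drop near-duplicates."""
    # Step 1: remove any training item that is too similar to some test item.
    no_leak = [
        x for x in train
        if all(similarity(x, t) < leak_threshold for t in test)
    ]
    # Step 2: greedily keep an item only if it is not too similar
    # to an item that has already been kept.
    kept: List[T] = []
    for x in no_leak:
        if all(similarity(x, k) < redundancy_threshold for k in kept):
            kept.append(x)
    return kept


# Hypothetical toy data: each "complex" is a set of single-letter features.
train = [
    frozenset("abcd"),  # identical to the test item, so it leaks
    frozenset("abce"),  # similarity 0.6 to the test item, below the threshold
    frozenset("wxyz"),
    frozenset("wxyz"),  # exact duplicate of the previous training item
    frozenset("mnop"),
]
test = [frozenset("abcd")]

cleaned = filter_training_set(train, test, jaccard)
print(len(train), "->", len(cleaned))  # 5 -> 3
```

The greedy second pass depends on input order, so a different ordering of `train` can keep a different representative of each redundant group.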
Nature Machine Intelligence, volume 7, issue 10, pages 1713-1725; Journal Article; PDF: https://www.nature.com/articles/s42256-025-01124-5.pdf
Citations: 0
Are neural network representations universal or idiosyncratic?
Pub Date: 2025-10-21; DOI: 10.1038/s42256-025-01139-y
Questions over whether neural networks learn universal or model-specific representations framed a community event at the Cognitive Computational Neuroscience conference in August 2025, highlighting future directions on a fundamental topic in NeuroAI.
Nature Machine Intelligence, volume 7, issue 10, pages 1589-1590; Journal Article; PDF: https://www.nature.com/articles/s42256-025-01139-y.pdf
Citations: 0
Tailored structured peptide design with a key-cutting machine approach
Pub Date: 2025-10-21; DOI: 10.1038/s42256-025-01119-2
Yan C. Leyva, Marcelo D. T. Torres, Carlos A. Oliva, Cesar de la Fuente-Nunez, Carlos A. Brizuela
Computational protein and peptide design is emerging as a transformative framework for engineering macromolecules with precise structures and functions, offering innovative solutions in medicine, biotechnology and materials science. However, current methods predominantly rely on generative models, which are expensive to train and modify. Here, we introduce the Key-Cutting Machine (KCM), an optimization-based platform that iteratively leverages structure prediction to match desired backbone geometries. KCM requires only a single graphics processing unit and enables seamless incorporation of user-defined requirements into the objective function, circumventing the high retraining costs typical of generative models while allowing straightforward assessment of measurable properties. By employing an estimation of distribution algorithm, KCM optimizes sequences on the basis of geometric, physicochemical and energetic criteria. We benchmarked its performance on α-helices, β-sheets, a combination of both and unstructured regions, demonstrating precise backbone geometry design. As a proof of concept, we applied KCM to antimicrobial peptide design by using a template antimicrobial peptide as the ‘key’, yielding a candidate with potent in vitro activity against multiple bacterial strains and efficacy in a murine infection model. KCM thus emerges as a robust tool for de novo protein and peptide design, offering a flexible paradigm for replicating and extending the structure–function relationships of existing templates. Powerful generative AI models for designing biological macromolecules are being developed, with applications in medicine, biotechnology and materials science, but these models are expensive to train and modify. Leyva et al. introduce the Key-Cutting Machine, an optimization-based platform for proteins and peptides that iteratively leverages structure prediction to match desired backbone geometries.
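The abstract above mentions an estimation of distribution algorithm (EDA) for sequence optimization. As a rough illustration of that family of methods, and not of KCM itself, here is a minimal univariate EDA (UMDA) over amino-acid strings. The objective `toy_score`, which simply counts matches to a target string, is a hypothetical stand-in for the structure-prediction-based criteria in the paper; the target string, population sizes, smoothing constant and function names are likewise assumptions.

```python
import random
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def toy_score(seq: str, target: str) -> int:
    """Hypothetical objective: number of positions that match a target string."""
    return sum(a == b for a, b in zip(seq, target))


def umda(target: str, pop_size: int = 200, n_elite: int = 40,
         generations: int = 60, smoothing: float = 0.1,
         seed: int = 0) -> Tuple[str, int]:
    """Univariate EDA: sample, select the best, re-estimate per-position marginals."""
    rng = random.Random(seed)
    k = len(AMINO_ACIDS)
    # One independent categorical distribution per position, initially uniform.
    probs: List[List[float]] = [[1.0 / k] * k for _ in target]
    best, best_score = "", -1
    for _ in range(generations):
        # Sample a population from the current per-position distributions.
        population = [
            "".join(rng.choices(AMINO_ACIDS, weights=p)[0] for p in probs)
            for _ in range(pop_size)
        ]
        # Select the highest-scoring sequences.
        population.sort(key=lambda s: toy_score(s, target), reverse=True)
        elite = population[:n_elite]
        top_score = toy_score(elite[0], target)
        if top_score > best_score:
            best, best_score = elite[0], top_score
        # Re-estimate each marginal from the elite set; the small smoothing
        # term keeps every residue reachable in later generations.
        for i in range(len(target)):
            counts = [smoothing] * k
            for s in elite:
                counts[AMINO_ACIDS.index(s[i])] += 1.0
            total = sum(counts)
            probs[i] = [c / total for c in counts]
    return best, best_score


# Arbitrary example target, chosen only to exercise the loop.
best, score = umda("KLAKLAKKLA")
print(best, score)
```

Because the objective is passed through a single scoring function, extra requirements can be added by changing `toy_score` alone, which mirrors the abstract's point about folding user-defined criteria into the objective without retraining a model.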
Nature Machine Intelligence, volume 7, issue 10, pages 1685-1697; Journal Article; PDF: https://www.nature.com/articles/s42256-025-01119-2.pdf
Citations: 0
Overcoming classic challenges for artificial neural networks by providing incentives and practice
Pub Date: 2025-10-20; DOI: 10.1038/s42256-025-01121-8
Kazuki Irie, Brenden M. Lake
Since the earliest proposals for artificial neural network models of the mind and brain, critics have pointed out key weaknesses in these models compared with human cognitive abilities. Here we review recent work that uses metalearning to overcome several classic challenges, which we characterize as addressing the problem of incentive and practice—that is, providing machines with both incentives to improve specific skills and opportunities to practice those skills. This explicit optimization contrasts with more conventional approaches that hope that the desired behaviour will emerge through optimizing related but different objectives. We review applications of this principle to address four classic challenges for artificial neural networks: systematic generalization, catastrophic forgetting, few-shot learning and multi-step reasoning. We also discuss how large language models incorporate key aspects of this metalearning framework (namely, sequence prediction with feedback trained on diverse data), which helps to explain some of their successes on these classic challenges. Finally, we discuss the prospects for understanding aspects of human development through this framework, and whether natural environments provide the right incentives and practice for learning how to make challenging generalizations. Irie and Lake present a metalearning framework that enables artificial neural networks to address classic challenges by providing both incentives to improve specific capabilities and opportunities to practice them.
Nature Machine Intelligence, volume 7, issue 10, pages 1602-1611; Journal Article
Citations: 0
Single-unit activations confer inductive biases for emergent circuit solutions to cognitive tasks
Pub Date: 2025-10-20; DOI: 10.1038/s42256-025-01127-2
Pavel Tolmachev, Tatiana A. Engel
Trained recurrent neural networks (RNNs) have become the leading framework for modelling neural dynamics in the brain, owing to their capacity to mimic how population-level computations arise from interactions among many units with heterogeneous responses. RNN units are commonly modelled using various nonlinear activation functions, assuming these architectural differences do not affect emerging task solutions. Here, contrary to this view, we show that single-unit activation functions confer inductive biases that influence the geometry of neural population trajectories, single-unit selectivity and fixed-point configurations. Using a model distillation approach, we find that differences in neural representations and dynamics reflect qualitatively distinct circuit solutions to cognitive tasks emerging in RNNs with different activation functions, leading to disparate generalization behaviour on out-of-distribution inputs. Our results show that seemingly minor architectural differences provide strong inductive biases for task solutions, raising a question about which RNN architectures better align with mechanisms of task execution in biological networks. Recurrent neural networks are widely used to model brain dynamics. Tolmachev and Engel show that single-unit activation functions influence task solutions that emerge in trained networks, raising the question of which design choices best align with biology.
训练的递归神经网络(RNNs)已经成为模拟大脑神经动力学的主要框架,因为它们能够模拟许多具有异质反应的单元之间的相互作用如何产生种群水平的计算。RNN单元通常使用各种非线性激活函数建模,假设这些架构差异不会影响新出现的任务解决方案。在这里,与这种观点相反,我们表明单单元激活函数赋予归纳偏差,影响神经种群轨迹的几何形状,单单元选择性和定点配置。使用模型蒸馏方法,我们发现神经表征和动态的差异反映了具有不同激活函数的rnn中出现的认知任务的定性不同电路解决方案,导致分布外输入的不同泛化行为。我们的研究结果表明,看似微小的架构差异为任务解决方案提供了强大的归纳偏差,这就提出了一个问题,即哪种RNN架构更符合生物网络中的任务执行机制。递归神经网络被广泛用于模拟大脑动力学。托尔马切夫和恩格尔表明,单单元激活函数会影响训练网络中出现的任务解决方案,这就提出了哪个设计选择最符合生物学的问题。
{"title":"Single-unit activations confer inductive biases for emergent circuit solutions to cognitive tasks","authors":"Pavel Tolmachev, Tatiana A. Engel","doi":"10.1038/s42256-025-01127-2","DOIUrl":"10.1038/s42256-025-01127-2","url":null,"abstract":"Trained recurrent neural networks (RNNs) have become the leading framework for modelling neural dynamics in the brain, owing to their capacity to mimic how population-level computations arise from interactions among many units with heterogeneous responses. RNN units are commonly modelled using various nonlinear activation functions, assuming these architectural differences do not affect emerging task solutions. Here, contrary to this view, we show that single-unit activation functions confer inductive biases that influence the geometry of neural population trajectories, single-unit selectivity and fixed-point configurations. Using a model distillation approach, we find that differences in neural representations and dynamics reflect qualitatively distinct circuit solutions to cognitive tasks emerging in RNNs with different activation functions, leading to disparate generalization behaviour on out-of-distribution inputs. Our results show that seemingly minor architectural differences provide strong inductive biases for task solutions, raising a question about which RNN architectures better align with mechanisms of task execution in biological networks. Recurrent neural networks are widely used to model brain dynamics. 
Tolmachev and Engel show that single-unit activation functions influence task solutions that emerge in trained networks, raising the question of which design choices best align with biology.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 10","pages":"1742-1754"},"PeriodicalIF":23.9,"publicationDate":"2025-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.comhttps://www.nature.com/articles/s42256-025-01127-2.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145352975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Foundation models’ acceptable use policies disregard the environment and nature
IF 23.9 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-10-16 DOI: 10.1038/s42256-025-01134-3
Björn Ekström, Lisa Engström, Jutta Haider
Nature Machine Intelligence, volume 7, issue 11, pages 1771-1772. Published 2025-10-16.
Citations: 0
A cautiously optimistic view on large language model use against linguistic injustice in academia
IF 23.9 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-10-16 DOI: 10.1038/s42256-025-01138-z
Jon Rueda, Viktor Ivanković, Charlie Blunden
Nature Machine Intelligence, volume 7, issue 11, pages 1773-1774. Published 2025-10-16.
Citations: 0
Flow matching for accelerated simulation of atomic transport in crystalline materials
IF 23.9 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-10-16 DOI: 10.1038/s42256-025-01125-4
Juno Nam, Sulin Liu, Gavin Winter, KyuJung Jun, Soojung Yang, Rafael Gómez-Bombarelli
Atomic transport underpins the performance of materials in technologies such as energy storage and electronics, yet its simulation remains computationally demanding. In particular, modelling ionic diffusion in solid-state electrolytes requires methods that can overcome the scale limitations of traditional ab initio molecular dynamics. We introduce LiFlow, a generative framework to accelerate molecular dynamics (MD) simulations for crystalline materials that formulates the task as the conditional generation of atomic displacements. The model uses flow matching, with a Propagator submodel to generate atomic displacements and a Corrector to locally correct unphysical geometries, and incorporates an adaptive prior based on the Maxwell–Boltzmann distribution to account for chemical and thermal conditions. We benchmark LiFlow on a dataset comprising 25-ps trajectories of lithium diffusion across 4,186 solid-state electrolyte candidates at four temperatures. The model obtains a consistent Spearman rank correlation of 0.7–0.8 for lithium mean squared displacement predictions on unseen compositions. Furthermore, LiFlow generalizes from short training trajectories to larger supercells and longer simulations and maintains high accuracy. With speed-ups of up to 600,000× compared with first-principles methods, LiFlow enables scalable simulations at significantly larger length scales and timescales.
Editor's summary: A generative framework that accelerates the simulations of atomic transport in crystalline solids is developed, enabling large-scale screening and extending simulations to larger spatiotemporal scales for energy storage materials.
Nature Machine Intelligence, volume 7, issue 10, pages 1625-1635. Published 2025-10-16.
Citations: 0
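The LiFlow abstract above reports a Spearman rank correlation for lithium mean squared displacement (MSD) predictions. As a rough illustration of those two quantities only, the sketch below computes an MSD between the first and last frame of a trajectory and a Spearman correlation between two lists of values. It is not the authors' code: the function names, the data layout (a trajectory as a list of frames of (x, y, z) positions) and the first-to-last-frame MSD definition are all assumptions made for this example, and the paper may define and average MSD differently.

```python
from typing import List, Sequence

Position = Sequence[float]   # (x, y, z) of one atom; layout is an assumption
Frame = Sequence[Position]   # positions of all tracked atoms at one time step


def mean_squared_displacement(trajectory: Sequence[Frame]) -> float:
    """Squared displacement from first to last frame, averaged over atoms."""
    first, last = trajectory[0], trajectory[-1]
    total = 0.0
    for start, end in zip(first, last):
        total += sum((e - s) ** 2 for s, e in zip(start, end))
    return total / len(first)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


if __name__ == "__main__":
    # Hypothetical reference vs. predicted MSD values for three materials.
    reference = [0.1, 0.5, 2.0]
    predicted = [1.0, 4.0, 100.0]
    print(spearman(reference, predicted))
```

Because the correlation is computed on ranks, it rewards a model that orders candidate materials correctly even when the predicted magnitudes are off, which is the property that matters for screening.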