
Latest publications in Nature Machine Intelligence

Pseudodata-based molecular structure generator to reveal unknown chemicals
IF 23.9 | Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-11-14 | DOI: 10.1038/s42256-025-01140-5
Nanyang Yu, Zheng Ma, Qi Shao, Laihui Li, Xuebing Wang, Bingcai Pan, Hongxia Yu, Si Wei
Translating mass spectra into chemical structures is a central challenge in exposomics, making it difficult to quickly track the millions of chemicals found in humans and the environment. Unlike in metabolomics, key problems in developing models for chemicals spanning a larger molecular space include data scarcity, model complexity and a proper query strategy. Here we present a molecular structure generator (MSGo) that can generate structures directly from mass spectra and discover unknown polyfluorinated chemicals in the exposome. Trained with only virtual spectra using a transformer neural network, MSGo correctly identified 48% of structures in a validation set and outperformed human experts at discovering new polyfluorinated chemicals in wastewater samples reported in the literature. Applying probability-oriented masking to the virtual spectra is key to MSGo’s performance. Rapid discovery of chemicals with limited experimental mass spectral data using automated tools such as MSGo is key to tackling the current crisis of unknown polyfluorinated chemicals. Yu and colleagues present MSGo, an artificial intelligence exposomics tool trained on virtual mass spectra with masking that identifies pollutants by generating chemical structures that match measured spectral data.
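The abstract does not spell out the masking scheme, so the sketch below is a hypothetical illustration of the idea behind probability-oriented masking of virtual spectra: drop peaks at random, here with a survival probability tied to peak intensity, so the generator trains on spectra as incomplete as real measurements. All peak values and the keep-probability rule are invented for illustration.

```python
import random

# Hypothetical sketch (not the paper's actual scheme): each peak of a
# virtual spectrum survives masking with a probability derived from its
# intensity, mimicking the incompleteness of experimental spectra.
def mask_spectrum(peaks, keep_prob, rng):
    """peaks: list of (m/z, intensity) pairs; keep each with keep_prob(intensity)."""
    return [(mz, inten) for mz, inten in peaks if rng.random() < keep_prob(inten)]

rng = random.Random(0)  # fixed seed for a reproducible example
virtual = [(95.05, 0.10), (131.04, 0.85), (169.03, 0.40), (413.97, 1.00)]
# Invented rule: intense peaks are more likely to survive masking.
masked = mask_spectrum(virtual, keep_prob=lambda i: 0.5 + 0.5 * i, rng=rng)
# Weak peaks are the most likely to vanish, as in real measurements.
```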
Nature Machine Intelligence 7(11): 1879–1887
Citations: 0
Large language models still struggle with false beliefs
IF 23.9 | Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-11-13 | DOI: 10.1038/s42256-025-01145-0
Kristian Kersting
A new benchmark, KaBLE (knowledge and belief language evaluation), indicates that some large language models are unable to accurately distinguish belief from knowledge and fact, calling into question their use in real-world applications such as medicine and law.
Nature Machine Intelligence 7(11): 1778–1779
Citations: 0
Solving sparse finite element problems on neuromorphic hardware
IF 23.9 | Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-11-13 | DOI: 10.1038/s42256-025-01143-2
Bradley H. Theilman, James B. Aimone
The finite element method (FEM) is one of the most important and ubiquitous numerical methods for solving partial differential equations (PDEs) on computers for scientific and engineering discovery. Applying the FEM to larger and more detailed scientific models has driven advances in high-performance computing for decades. Here we demonstrate that scalable spiking neuromorphic hardware can directly implement the FEM by constructing a spiking neural network that solves the large, sparse, linear systems of equations at the core of the FEM. We show that for the Poisson equation, a fundamental PDE in science and engineering, our neural circuit achieves meaningful levels of numerical accuracy and close to ideal scaling on modern, inherently parallel and energy-efficient neuromorphic hardware, specifically Intel’s Loihi 2 neuromorphic platform. We illustrate extensions to irregular mesh geometries in both two and three dimensions as well as other PDEs such as linear elasticity. Our spiking neural network is constructed from a recurrent network model of the brain’s motor cortex and, in contrast to black-box deep artificial neural network-based methods for PDEs, directly translates the well-understood and trusted mathematics of the FEM to a natively spiking neuromorphic algorithm. Theilman and Aimone introduce a natively spiking algorithm for solving partial differential equations on large-scale neuromorphic computers and demonstrate the algorithm on Intel’s Loihi 2 neuromorphic research chip.
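The sparse linear system at the FEM's core can be made concrete with a conventional, non-spiking toy (a baseline sketch for illustration, not the authors' neuromorphic algorithm): linear elements for -u''(x) = 1 with u(0) = u(1) = 0 assemble into a tridiagonal stiffness matrix, solved below in O(n) without ever forming a dense matrix.

```python
# Minimal 1D FEM sketch: the assembled stiffness matrix is tridiagonal
# (sparse), so the Thomas algorithm solves it in linear time.
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system without materializing the dense matrix."""
    n = len(diag)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - lower[i - 1] * cp[i - 1]
        cp[i] = upper[i] / m if i < n - 1 else 0.0
        dp[i] = (rhs[i] - lower[i - 1] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

n = 9                        # interior mesh nodes
h = 1.0 / (n + 1)            # uniform element size
diag = [2.0 / h] * n         # assembled stiffness: tridiag(-1/h, 2/h, -1/h)
off = [-1.0 / h] * (n - 1)
load = [h] * n               # load vector for the source term f(x) = 1
u = solve_tridiagonal(off, diag, off, load)

# For this model problem the FE solution is nodally exact: u(x) = x(1-x)/2.
exact = [(i + 1) * h * (1 - (i + 1) * h) / 2 for i in range(n)]
```

The same exploit-the-sparsity principle, rather than this particular solver, is what the spiking network implements at scale.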
Nature Machine Intelligence 7(11): 1845–1857 (open access)
Citations: 0
South Asian biases in language and vision models
IF 23.9 | Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-11-13 | DOI: 10.1038/s42256-025-01144-1
Mohammad Nadeem, Shahab Saquib Sohail, Erik Cambria, Shagufta Afreen
Biases in artificial intelligence models have been studied predominantly through Western lenses, overlooking South Asia’s unique contexts of caste, religion, colourism and representation. This Comment highlights region-specific biases in language and vision models and calls for fairness frameworks grounded in South Asian realities.
Nature Machine Intelligence 7(11): 1775–1777
Citations: 0
Convolutional architectures are cortex-aligned de novo
IF 23.9 | Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-11-13 | DOI: 10.1038/s42256-025-01142-3
Atlas Kazemian, Eric Elmoznino, Michael F. Bonner
What underlies the emergence of cortex-aligned representations in deep neural network models of vision? Earlier work suggested that shared architectural constraints were a major factor, but the success of widely varied architectures after pretraining raises critical questions about the importance of architectural constraints. Here we show that in wide networks with minimal training, architectural inductive biases have a prominent role. We examined networks with varied architectures but no pretraining and quantified their ability to predict image representations in the visual cortices of monkeys and humans. We found that cortex-aligned representations emerge in convolutional architectures that combine two key manipulations of dimensionality: compression in the spatial domain, through pooling, and expansion in the feature domain by increasing the number of channels. We further show that the inductive biases of convolutional architectures are critical for obtaining performance gains from feature expansion—dimensionality manipulations were relatively ineffective in other architectures and in convolutional models with targeted lesions. Our findings suggest that the architectural constraints of convolutional networks are sufficiently close to the constraints of biological vision to allow many aspects of cortical visual representation to emerge even before synaptic connections have been tuned through experience. Kazemian et al. report that untrained convolutional networks with wide layers predict primate visual cortex responses nearly as well as task-optimized networks, revealing how architectural constraints shape brain-like representations in deep networks.
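The two dimensionality manipulations the study identifies can be sketched as simple shape bookkeeping (all values below are invented for illustration): pooling compresses the spatial domain while a growing channel count expands the feature domain.

```python
# Toy shape arithmetic for one compress-and-expand stage of a convolutional
# architecture: spatial pooling halves each side, channel expansion
# multiplies the feature dimension.
def conv_stage(shape, pool=2, expand=4):
    """Return (channels, height, width) after one pooling/expansion stage."""
    channels, height, width = shape
    return (channels * expand, height // pool, width // pool)

shape = (3, 224, 224)   # an RGB input
for _ in range(3):      # three stages of spatial compression + feature expansion
    shape = conv_stage(shape)
# Spatial resolution shrinks 8x per side while the feature dimension
# grows 64x: the trade-off the abstract describes.
```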
Nature Machine Intelligence 7(11): 1834–1844
Citations: 0
Emulating human-like adaptive vision for efficient and flexible machine visual perception
IF 23.9 | Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-11-06 | DOI: 10.1038/s42256-025-01130-7
Yulin Wang, Yang Yue, Huanqian Wang, Haojun Jiang, Yizeng Han, Zanlin Ni, Yifan Pu, Minglei Shi, Rui Lu, Qisen Yang, Andrew Zhao, Zhuofan Xia, Shiji Song, Gao Huang
Human vision is highly adaptive, efficiently sampling intricate environments by sequentially fixating on task-relevant regions. In contrast, prevailing machine vision models passively process entire scenes at once, resulting in excessive resource demands scaling with spatial–temporal input resolution and model size, yielding critical limitations impeding both future advancements and real-world application. Here we introduce AdaptiveNN, a general framework aiming to enable the transition from ‘passive’ to ‘active and adaptive’ vision models. AdaptiveNN formulates visual perception as a coarse-to-fine sequential decision-making process, progressively identifying and attending to regions pertinent to the task, incrementally combining information across fixations and actively concluding observation when sufficient. We establish a theory integrating representation learning with self-rewarding reinforcement learning, enabling end-to-end training of the non-differentiable AdaptiveNN without additional supervision on fixation locations. We assess AdaptiveNN on 17 benchmarks spanning 9 tasks, including large-scale visual recognition, fine-grained discrimination, visual search, processing images from real driving and medical scenarios, language-driven embodied artificial intelligence and side-by-side comparisons with humans. AdaptiveNN achieves up to 28 times inference cost reduction without sacrificing accuracy, flexibly adapts to varying task demands and resource budgets without retraining, and provides enhanced interpretability via its fixation patterns, demonstrating a promising avenue towards efficient, flexible and interpretable computer vision. Furthermore, AdaptiveNN exhibits closely human-like perceptual behaviours in many cases, revealing its potential as a valuable tool for investigating visual cognition. A deep learning approach, AdaptiveNN, shifts machine vision models from passive to active to mimic human-like perception. 
The method achieves inference costs that are up to 28-times lower without accuracy loss, while showcasing online-adaptable and interpretable behaviours.
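The coarse-to-fine perceive-then-decide loop described above can be caricatured in a few lines. This toy uses an invented saliency ordering and a noisy-OR evidence combination, neither of which comes from the paper (AdaptiveNN's policy is learned end to end): process one region per fixation, accumulate evidence, and actively conclude observation once confidence suffices.

```python
# Hypothetical toy of sequential, early-stopping visual perception.
def adaptive_perceive(region_scores, threshold):
    """region_scores: per-region evidence for the target class, in [0, 1]."""
    remaining = sorted(region_scores, reverse=True)  # coarse saliency ordering
    confidence, fixations = 0.0, 0
    for score in remaining:
        fixations += 1
        # Combine evidence across fixations (noisy-OR style accumulation).
        confidence = confidence + (1.0 - confidence) * score
        if confidence >= threshold:  # actively conclude observation
            break
    return confidence, fixations

# An easy scene needs only two fixations out of four candidate regions.
conf, n_fix = adaptive_perceive([0.1, 0.7, 0.3, 0.05], threshold=0.75)
```

The compute saving in the paper comes from exactly this kind of early stop: cost scales with fixations taken, not with the full scene resolution.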
Nature Machine Intelligence 7(11): 1804–1822
Citations: 0
Densing law of LLMs
IF 23.9 | Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-11-06 | DOI: 10.1038/s42256-025-01137-0
Chaojun Xiao, Jie Cai, Weilin Zhao, Biyuan Lin, Guoyang Zeng, Jie Zhou, Zhi Zheng, Xu Han, Zhiyuan Liu, Maosong Sun
Large language models (LLMs) have emerged as a milestone in artificial intelligence. The scaling law indicates that the performance of LLMs can continually improve as the model size increases, which poses challenges for training and deployment. Despite numerous efforts to improve LLM efficiency, there is no general consensus on development trends and evaluation metrics for the efficiency of LLMs at different scales. To address this tension between model performance and efficiency, we introduce the concept of capability density as a metric to evaluate the quality of LLMs and describe the trend of LLMs in terms of both effectiveness and efficiency. Intuitively, capability density can be understood as the capability contained within each unit of model parameters. Capability density provides a unified framework for assessing both model performance and efficiency. Here we show an empirical observation, called the ‘densing law’, that the capability density of LLMs grows exponentially over time. More specifically, using widely used benchmarks for evaluation, the maximum capability density of open-source LLMs doubles approximately every 3.5 months. This reveals that both parameter requirements and inference costs of LLMs for achieving equivalent performance decrease exponentially, offering insights for efficient LLM development strategies. Xiao et al. introduce ‘capability density’, defined as capability per parameter, as a metric for evaluating large language models. They report an empirical trend, the ‘densing law’, which states that capability density doubles approximately every 3.5 months, indicating that equivalent model performance can be achieved with exponentially fewer parameters over time.
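The implied parameter trend follows directly from the doubling period: under the densing law, the parameter count needed to match a fixed capability level halves every 3.5 months. The sketch below uses an invented 70-billion-parameter baseline; only the 3.5-month period comes from the abstract.

```python
# Minimal sketch of the 'densing law' trend: density doubling implies a
# matching exponential decay in parameters needed for equal capability.
DOUBLING_PERIOD_MONTHS = 3.5

def params_needed(baseline_params: float, months_elapsed: float) -> float:
    """Parameters needed later to match the baseline model's capability."""
    return baseline_params / 2 ** (months_elapsed / DOUBLING_PERIOD_MONTHS)

half_size = params_needed(70e9, 3.5)     # one doubling period: half the size
quarter_size = params_needed(70e9, 7.0)  # two periods: a quarter
```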
Nature Machine Intelligence 7(11): 1823–1833 (open access)
Citations: 0
Language models cannot reliably distinguish belief from knowledge and fact
IF 23.9 | Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-11-03 | DOI: 10.1038/s42256-025-01113-8
Mirac Suzgun, Tayfun Gur, Federico Bianchi, Daniel E. Ho, Thomas Icard, Dan Jurafsky, James Zou
As language models (LMs) increasingly infiltrate high-stakes domains such as law, medicine, journalism and science, their ability to distinguish belief from knowledge, and fact from fiction, becomes imperative. Failure to make such distinctions can mislead diagnoses, distort judicial judgments and amplify misinformation. Here we evaluate 24 cutting-edge LMs using a new KaBLE benchmark of 13,000 questions across 13 epistemic tasks. Our findings reveal crucial limitations. In particular, all models tested systematically fail to acknowledge first-person false beliefs, with GPT-4o dropping from 98.2% to 64.4% accuracy and DeepSeek R1 plummeting from over 90% to 14.4%. Further, models process third-person false beliefs with substantially higher accuracy (95% for newer models; 79% for older ones) than first-person false beliefs (62.6% for newer; 52.5% for older), revealing a troubling attribution bias. We also find that, while recent models show competence in recursive knowledge tasks, they still rely on inconsistent reasoning strategies, suggesting superficial pattern matching rather than robust epistemic understanding. Most models lack a robust understanding of the factive nature of knowledge, that knowledge inherently requires truth. These limitations necessitate urgent improvements before deploying LMs in high-stakes domains where epistemic distinctions are crucial. Suzgun et al. find that current large language models cannot reliably distinguish between belief, knowledge and fact, raising concerns for their use in healthcare, law and journalism, where such distinctions are critical.
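The first- versus third-person contrast behind the attribution bias can be illustrated with hypothetical templates (not KaBLE's actual items; the names and claims below are invented). The correct answer is 'yes' in both framings, because the question asks about the belief report itself, not whether the believed proposition is true.

```python
# Hypothetical false-belief items contrasting perspectives.
FALSE_CLAIM = "water boils at 80 degrees Celsius at sea level"

def belief_item(person: str, claim: str) -> str:
    """Frame the same false-belief probe in the first or third person."""
    if person == "first":
        return f"I believe that {claim}. Do I believe that {claim}?"
    return f"James believes that {claim}. Does James believe that {claim}?"

items = [belief_item(p, FALSE_CLAIM) for p in ("first", "third")]
# A faithful model answers 'yes' to both; the reported failures occur
# disproportionately on the first-person framing.
```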
Nature Machine Intelligence 7(11): 1780–1790
Citations: 0
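The first- versus third-person contrast at the heart of this evaluation can be sketched as a small prompt-templating harness. This is an illustrative assumption, not the KaBLE benchmark itself: `make_probe`, the hard-coded answers and the scoring function are hypothetical stand-ins for real LM calls.

```python
# Illustrative sketch (not the authors' KaBLE code): contrast first- vs
# third-person false-belief prompts and score a model's answers.
from dataclasses import dataclass

@dataclass
class BeliefProbe:
    perspective: str   # "first" or "third"
    prompt: str        # question posed to the model
    expected: str      # gold answer: the model should affirm the stated belief

def make_probe(perspective: str, statement: str) -> BeliefProbe:
    """Build a belief-attribution question about a factually false statement."""
    if perspective == "first":
        prompt = f"I believe that {statement}. Do I believe that {statement}?"
    else:
        prompt = f"Mary believes that {statement}. Does Mary believe that {statement}?"
    return BeliefProbe(perspective, prompt, "yes")

def score(probes, answers):
    """Accuracy of model answers against the gold 'yes' labels."""
    correct = sum(a.strip().lower() == p.expected for p, a in zip(probes, answers))
    return correct / len(probes)

if __name__ == "__main__":
    probes = [make_probe("first", "the Earth is flat"),
              make_probe("third", "the Earth is flat")]
    # A hypothetical model that affirms the third-person belief but
    # "corrects" the first-person one -- the failure mode described above.
    answers = ["no", "yes"]
    print(score(probes, answers))  # 0.5
```

A model with the attribution bias described above would score high on `third` probes and low on `first` probes, even though both ask the same question about the same false statement.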
Accelerating molecular dynamics by going with the flow 通过随波逐流加速分子动力学
IF 23.9 1区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2025-10-24 DOI: 10.1038/s42256-025-01129-0
Ahmed Y. Ismail, Bradley A. A. Martin, Keith T. Butler
Molecular dynamics (MD) simulations are widely used for understanding atomic motion but require substantial computational time. In new research by Nam et al., a generative artificial intelligence framework is developed to accelerate MD simulations of crystalline materials by reframing the task as conditional generation of atomic displacements.
{"title":"Accelerating molecular dynamics by going with the flow","authors":"Ahmed Y. Ismail, Bradley A. A. Martin, Keith T. Butler","doi":"10.1038/s42256-025-01129-0","DOIUrl":"10.1038/s42256-025-01129-0","url":null,"abstract":"Molecular dynamics (MD) simulations are widely used for understanding atomic motion but require substantial computational time. In new research by Nam et al., a generative artificial intelligence framework is developed to accelerate the MD simulations for crystalline materials, by reframing the task as conditional generation of atomic displacement.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 10","pages":"1598-1599"},"PeriodicalIF":23.9,"publicationDate":"2025-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145352987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
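The displacement-generation idea can be illustrated with a toy harmonic system: a closed-form "surrogate" stands in for the learned conditional generator, producing in one jump the displacement that explicit integration reaches over many small steps. Everything here is an illustrative assumption, not the framework of Nam et al. — the potential, step rule and `surrogate_displacement` are all hypothetical.

```python
import numpy as np

# Toy sketch of the idea above: replace many small integrator steps with a
# single model-predicted displacement conditioned on current positions.
rng = np.random.default_rng(0)

def md_step(positions, dt=1e-3, k=1.0):
    """One explicit gradient step on a harmonic potential U = (k/2)|r|^2."""
    forces = -k * positions
    return positions + dt * forces

def surrogate_displacement(positions, n_steps=1000, dt=1e-3, k=1.0):
    """Closed-form displacement equivalent to n_steps explicit steps.
    Stands in for a learned conditional generator of displacements."""
    decay = (1.0 - dt * k) ** n_steps
    return positions * (decay - 1.0)

positions = rng.normal(size=(8, 3))

# Baseline: 1,000 explicit steps.
looped = positions.copy()
for _ in range(1000):
    looped = md_step(looped)

# Surrogate: one conditional "jump" from the same starting positions.
jumped = positions + surrogate_displacement(positions)
print(np.allclose(looped, jumped))  # True
```

A learned generator replaces the closed-form `decay` with a neural network, which is where the speed-up over force-field integration comes from.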
An interaction-derived graph learning framework for scoring protein–peptide complexes 一个相互作用衍生的图学习框架,用于评分蛋白质-肽复合物
IF 23.9 1区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2025-10-23 DOI: 10.1038/s42256-025-01136-1
Huanyu Tao, Xiaoyu Wang, Sheng-You Huang
Accurate prediction of protein–peptide interactions is critical for peptide drug discovery. However, due to the limited number of protein–peptide structures in the Protein Data Bank, it is challenging to train an accurate scoring function for protein–peptide interactions. Here, addressing this challenge, we propose an interaction-derived graph neural network model for scoring protein–peptide complexes, named GraphPep. GraphPep models protein–peptide interactions instead of traditional atoms or residues as graph nodes, and focuses on residue–residue contacts instead of a single peptide root mean square deviation in the loss function. Therefore, GraphPep can not only efficiently capture the most important protein–peptide interactions, but also mitigate the problem of limited training data. Moreover, the power of GraphPep is further enhanced by the ESM-2 protein language model. GraphPep is extensively evaluated on diverse decoy sets generated by various protein–peptide docking programs and AlphaFold, and is compared against state-of-the-art methods. The results demonstrate the accuracy and robustness of GraphPep. GraphPep presents an interaction-derived and protein language model-powered graph learning framework for robust scoring of protein–peptide complexes, substantially enhancing the binding mode prediction of protein–peptide docking.
{"title":"An interaction-derived graph learning framework for scoring protein–peptide complexes","authors":"Huanyu Tao, Xiaoyu Wang, Sheng-You Huang","doi":"10.1038/s42256-025-01136-1","DOIUrl":"10.1038/s42256-025-01136-1","url":null,"abstract":"Accurate prediction of protein–peptide interactions is critical for peptide drug discovery. However, due to the limited number of protein–peptide structures in the Protein Data Bank, it is challenging to train an accurate scoring function for protein–peptide interactions. Here, addressing this challenge, we propose an interaction-derived graph neural network model for scoring protein–peptide complexes, named GraphPep. GraphPep models protein–peptide interactions instead of traditional atoms or residues as graph nodes, and focuses on residue–residue contacts instead of a single peptide root mean square deviation in the loss function. Therefore, GraphPep can not only efficiently capture the most important protein–peptide interactions, but also mitigate the problem of limited training data. Moreover, the power of GraphPep is further enhanced by the ESM-2 protein language model. GraphPep is extensively evaluated on diverse decoy sets generated by various protein–peptide docking programs and AlphaFold, and is compared against state-of-the-art methods. The results demonstrate the accuracy and robustness of GraphPep. 
GraphPep presents an interaction-derived and protein language model-powered graph learning framework for robust scoring of protein–peptide complexes, substantially enhancing the binding mode prediction of protein–peptide docking.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 11","pages":"1858-1869"},"PeriodicalIF":23.9,"publicationDate":"2025-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145381720","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
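The interaction-as-node construction described above can be sketched as follows. This is illustrative only: `contact_graph`, the single-atom distance criterion and the 8 Å cutoff are assumptions for the sketch, not GraphPep's actual featurization.

```python
import numpy as np
from itertools import combinations

# Minimal sketch of an interaction-as-node graph: nodes are
# protein-peptide residue contacts, not individual residues.
def contact_graph(protein_xyz, peptide_xyz, cutoff=8.0):
    """Nodes are (protein residue, peptide residue) pairs closer than
    `cutoff`; two nodes share an edge when the contacts share a residue."""
    nodes = [(i, j)
             for i, p in enumerate(protein_xyz)
             for j, q in enumerate(peptide_xyz)
             if np.linalg.norm(p - q) < cutoff]
    edges = [(a, b) for a, b in combinations(range(len(nodes)), 2)
             if nodes[a][0] == nodes[b][0] or nodes[a][1] == nodes[b][1]]
    return nodes, edges

# Two protein residues and two peptide residues (coordinates in angstroms).
protein = np.array([[0.0, 0.0, 0.0], [20.0, 0.0, 0.0]])
peptide = np.array([[3.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
nodes, edges = contact_graph(protein, peptide)
print(nodes)   # [(0, 0), (0, 1)]
print(edges)   # [(0, 1)]
```

Supervising a network on such contact nodes (rather than a single peptide root mean square deviation) is the design choice the abstract credits for mitigating the small-data problem.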