
Latest articles from Nature Computational Science

A complete photonic integrated neuron for nonlinear all-optical computing
IF 18.3 Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-09-12 DOI: 10.1038/s43588-025-00866-x
Tao Yan, Yanchen Guo, Tiankuang Zhou, Guocheng Shao, Shanglong Li, Ruqi Huang, Qionghai Dai, Lu Fang
The field of photonic neural networks has experienced substantial growth, driven by its potential to enable ultrafast artificial intelligence inference and address the escalating demand for computing speed and energy efficiency. However, realizing nonlinearity-complete all-optical neurons is still challenging, constraining the performance of photonic neural networks. Here we report a complete photonic integrated neuron (PIN) with spatiotemporal feature learning capabilities and reconfigurable structures for nonlinear all-optical computing. By interleaving the spatiotemporal dimension of photons and leveraging the Kerr effect, PIN performs high-order temporal convolution and all-optical nonlinear activation monolithically on a silicon-nitride photonic chip, achieving neuron completeness of weighted interconnects and nonlinearities. We develop the PIN chip system and demonstrate its remarkable performance in high-accuracy image classification and human motion generation. PIN enables ultrafast spatiotemporal processing with a latency as low as 240 ps, paving the way for advancing machine intelligence into the subnanosecond regime. This study reports a complete photonic neuron integrated on a silicon-nitride chip, enabling ultrafast all-optical computing with nonlinear multi-kernel convolution for image recognition and motion generation.
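To make the neuron-completeness claim concrete, here is a minimal numerical sketch of the two ingredients the abstract names: a weighted temporal convolution followed by a Kerr-style intensity-dependent nonlinearity. This is a toy Python model for illustration only, not the authors' photonic implementation; the function names and the saturating phase response are assumptions.

```python
import numpy as np

def temporal_convolution(pulses: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Weighted interconnect: convolve an input pulse train with a temporal kernel."""
    return np.convolve(pulses, kernel, mode="valid")

def kerr_activation(field: np.ndarray, phi_max: float = np.pi) -> np.ndarray:
    """Toy Kerr-style nonlinearity: an intensity-dependent phase shift read out
    interferometrically as an intensity transfer function (illustrative model)."""
    intensity = np.abs(field) ** 2
    phase = phi_max * intensity / (1.0 + intensity)  # saturating nonlinear phase
    return intensity * np.cos(phase / 2.0) ** 2      # interferometric readout

rng = np.random.default_rng(0)
pulses = rng.random(64)        # input pulse-train intensities (arbitrary units)
kernel = rng.normal(size=8)    # one temporal convolution kernel ("weights")
features = kerr_activation(temporal_convolution(pulses, kernel))
print(features.shape)          # (57,): one toy neuron's nonlinear temporal features
```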
Citations: 0
Confidential computing for population-scale genome-wide association studies with SECRET-GWAS
IF 18.3 Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-09-12 DOI: 10.1038/s43588-025-00856-z
Jonah Rosenblum, Juechu Dong, Satish Narayanasamy
Genomic data from a single institution lacks global diversity representation, especially for rare variants and diseases. Confidential computing can enable collaborative genome-wide association studies (GWAS) without compromising privacy or accuracy. However, due to limited secure memory space and performance overheads, previous solutions fail to support widely used regression methods. Here we present SECRET-GWAS—a rapid, privacy-preserving, population-scale, collaborative GWAS tool. We discuss several system optimizations, including streaming, batching, data parallelization and reducing trusted hardware overheads to efficiently scale linear and logistic regression to over a thousand processor cores on an Intel SGX-based cloud platform. In addition, we protect SECRET-GWAS against several hardware side-channel attacks. SECRET-GWAS is an open-source tool and works with the widely used Hail genomic analysis framework. Our experiments on Azure’s Confidential Computing platform demonstrate that SECRET-GWAS enables multivariate linear and logistic regression GWAS queries on population-scale datasets from ten independent sources in just 4.5 and 29 minutes, respectively. Secure collaborative genome-wide association studies (GWAS) with population-scale datasets address gaps in genomic data. This work proposes SECRET-GWAS and system optimizations that overcome resource constraints and exploit parallelism, while maintaining privacy and accuracy.
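The streaming and batching optimizations can be illustrated with a small sketch of how linear regression scales to data that cannot fit in limited secure memory: accumulate the sufficient statistics X^T X and X^T y batch by batch, and solve the normal equations once at the end. This is a generic Python illustration under that assumption, not the SECRET-GWAS code or its Hail integration.

```python
import numpy as np

def streaming_linear_regression(batches):
    """Accumulate X^T X and X^T y over batches so the full genotype/covariate
    matrix never has to reside in memory at once; solve the normal equations last."""
    xtx, xty = None, None
    for X, y in batches:
        if xtx is None:
            d = X.shape[1]
            xtx, xty = np.zeros((d, d)), np.zeros(d)
        xtx += X.T @ X
        xty += X.T @ y
    return np.linalg.solve(xtx, xty)

# Toy data: three batches of 1,000 samples with 5 predictors each.
rng = np.random.default_rng(1)
true_beta = np.array([0.5, -0.2, 0.0, 0.1, 0.3])

def make_batch(n=1000):
    X = rng.normal(size=(n, 5))
    return X, X @ true_beta + rng.normal(scale=0.1, size=n)

beta_hat = streaming_linear_regression(make_batch() for _ in range(3))
print(np.round(beta_hat, 2))  # recovers true_beta to within noise
```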
Citations: 0
Vision language models excel at perception but struggle with scientific reasoning
IF 18.3 Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-09-10 DOI: 10.1038/s43588-025-00871-0
A benchmark — MaCBench — is developed for evaluating the scientific knowledge of vision language models (VLMs). Evaluation of leading VLMs reveals that they excel at basic scientific tasks such as equipment identification, but struggle with spatial reasoning and multistep analysis — a limitation for autonomous scientific discovery.
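As a sketch of how such a benchmark is typically scored, the loop below groups accuracy by task category so that strengths (for example, equipment identification) and weaknesses (for example, spatial reasoning) become visible separately. The `query_vlm` callable and the item fields are hypothetical placeholders, not the MaCBench harness.

```python
from collections import defaultdict

def evaluate_by_category(items, query_vlm):
    """Score a vision language model per task category.
    `query_vlm(image, question) -> answer` is a hypothetical stand-in for the
    model API under test; item fields are assumed for illustration."""
    correct, total = defaultdict(int), defaultdict(int)
    for item in items:
        answer = query_vlm(item["image"], item["question"])
        total[item["category"]] += 1
        correct[item["category"]] += int(answer.strip().lower() == item["answer"].lower())
    return {cat: correct[cat] / total[cat] for cat in total}

# Usage (illustrative): scores = evaluate_by_category(items, query_vlm)
# -> per-category accuracies, e.g. high for equipment identification,
#    low for spatial reasoning, mirroring the qualitative finding above.
```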
Citations: 0
SurFF: a foundation model for surface exposure and morphology across intermetallic crystals
IF 18.3 Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-09-09 DOI: 10.1038/s43588-025-00839-0
Jun Yin, Honghao Chen, Jiangjie Qiu, Wentao Li, Peng He, Jiali Li, Iftekhar A. Karimi, Xiaocheng Lan, Tiefeng Wang, Xiaonan Wang
With approximately 90% of industrial reactions occurring on surfaces, the role of heterogeneous catalysts is paramount. Currently, accurate surface exposure prediction is vital for heterogeneous catalyst design, but it is hindered by the high costs of experimental and computational methods. Here we introduce a foundation force-field-based model for predicting surface exposure and synthesizability (SurFF) across intermetallic crystals, which are essential materials for heterogeneous catalysts. We created a comprehensive intermetallic surface database using an active learning method and high-throughput density functional theory calculations, encompassing 12,553 unique surfaces and 344,200 single points. SurFF achieves density-functional-theory-level precision with a prediction error of 3 meV Å⁻² and enables large-scale surface exposure prediction with a 10⁵-fold acceleration. Validation against both computational and experimental data shows strong alignment. We applied SurFF for large-scale predictions of surface energy and Wulff shapes for over 6,000 intermetallic crystals, providing valuable data for the community. A foundation machine learning model, SurFF, enables DFT-accurate predictions of surface energies and morphologies in intermetallic catalysts, achieving over 10⁵-fold acceleration for high-throughput materials screening.
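For context, the surface energies that SurFF predicts at scale are conventionally defined through the slab construction, gamma = (E_slab - N * E_bulk) / (2A), with the factor of two accounting for the slab's two exposed faces. The sketch below evaluates that textbook formula with hypothetical numbers; it is not SurFF itself, which replaces the expensive DFT energies with force-field predictions.

```python
def surface_energy(e_slab: float, n_bulk_units: int, e_bulk_per_unit: float,
                   area: float) -> float:
    """Slab-model surface energy: gamma = (E_slab - N * E_bulk) / (2 * A).
    Energies in eV, area in square angstroms; returns eV per square angstrom."""
    return (e_slab - n_bulk_units * e_bulk_per_unit) / (2.0 * area)

# Hypothetical slab and bulk energies for illustration only (not SurFF output):
gamma = surface_energy(e_slab=-120.40, n_bulk_units=24, e_bulk_per_unit=-5.10, area=45.0)
print(f"{gamma * 1000:.1f} meV per square angstrom")  # ~22.2
```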
Citations: 0
Using LLMs to advance the cognitive science of collectives
IF 18.3 Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-09-09 DOI: 10.1038/s43588-025-00848-z
Ilia Sucholutsky, Katherine M. Collins, Nori Jacoby, Bill D. Thompson, Robert D. Hawkins
Large language models (LLMs) are already transforming the study of individual cognition, but their application to studying collective cognition has been underexplored. We lay out how LLMs may be able to address the complexity that has hindered the study of collectives and raise possible risks that warrant new methods.
Citations: 0
Urban planning in the era of large language models
IF 18.3 Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-09-08 DOI: 10.1038/s43588-025-00846-1
Yu Zheng, Fengli Xu, Yuming Lin, Paolo Santi, Carlo Ratti, Qi R. Wang, Yong Li
City plans are the product of integrating human creativity with emerging technologies, which continuously evolve and reshape urban morphology and environments. Here we argue that large language models hold large untapped potential in addressing the growing complexities of urban planning and enabling a more holistic, innovative and responsive approach to city design. By harnessing their advanced generation and simulation capabilities, large language models can contribute as an intelligent assistant for human planners in synthesizing conceptual ideas, generating urban designs and evaluating the outcomes of planning efforts. Large language models remain largely unexplored in the design of cities. In this Perspective, the authors discuss the potential opportunities brought by these models in assisting urban planning.
Citations: 0
The other AI revolution: how the Global South is building and repurposing language models that speak to billions
IF 18.3 Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-09-08 DOI: 10.1038/s43588-025-00865-y
Pedro Burgos
While leading tech companies race to build ever-larger models, researchers in Brazil, India and Africa are using clever tricks to remix big labs’ LLMs to bring AI to billions of users.
Citations: 0
Overcoming computational bottlenecks in large language models through analog in-memory computing
IF 18.3 Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-09-08 DOI: 10.1038/s43588-025-00860-3
Yudeng Lin, Jianshi Tang
A recent study demonstrates the potential of using an in-memory computing architecture for implementing large language models, improving computational efficiency in both time and energy while maintaining high accuracy.
Citations: 0
Analog in-memory computing attention mechanism for fast and energy-efficient large language models
IF 18.3 Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-09-08 DOI: 10.1038/s43588-025-00854-1
Nathan Leroux, Paul-Philipp Manea, Chirag Sudarshan, Jan Finkbeiner, Sebastian Siegel, John Paul Strachan, Emre Neftci
Transformer networks, driven by self-attention, are central to large language models. In generative transformers, self-attention uses cache memory to store token projections, avoiding recomputation at each time step. However, graphics processing unit (GPU)-stored projections must be loaded into static random-access memory for each new generation step, causing latency and energy bottlenecks. Here we present a custom self-attention in-memory computing architecture based on emerging charge-based memories called gain cells, which can be efficiently written to store new tokens during sequence generation and enable parallel analog dot-product computation required for self-attention. However, the analog gain-cell circuits introduce non-idealities and constraints preventing the direct mapping of pre-trained models. To circumvent this problem, we design an initialization algorithm achieving text-processing performance comparable to GPT-2 without training from scratch. Our architecture reduces attention latency and energy consumption by up to two and four orders of magnitude, respectively, compared with GPUs, marking a substantial step toward ultrafast, low-power generative transformers. Leveraging in-memory computing with emerging gain-cell devices, the authors accelerate attention—a core mechanism in large language models. They train a 1.5-billion-parameter model, achieving up to a 70,000-fold reduction in energy consumption and a 100-fold speed-up compared with GPUs.
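The cache-reloading bottleneck that motivates the gain-cell design is easiest to see in a plain single-head sketch of key/value-cached decoding: every generation step appends one key and one value row and then re-reads the entire cache to form the attention output. The NumPy class below is a toy illustration of that access pattern, not the analog hardware or its initialization algorithm.

```python
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

class KVCacheAttention:
    """Single-head self-attention with a key/value cache for autoregressive decoding.
    Each step writes one K row and one V row, then attends over the whole cache;
    on GPUs that cache is re-read from memory at every step."""
    def __init__(self, d_model: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.Wq = rng.normal(scale=d_model ** -0.5, size=(d_model, d_model))
        self.Wk = rng.normal(scale=d_model ** -0.5, size=(d_model, d_model))
        self.Wv = rng.normal(scale=d_model ** -0.5, size=(d_model, d_model))
        self.keys, self.values = [], []

    def step(self, x: np.ndarray) -> np.ndarray:
        q = x @ self.Wq
        self.keys.append(x @ self.Wk)    # written once, reused at every later step
        self.values.append(x @ self.Wv)
        K, V = np.stack(self.keys), np.stack(self.values)
        weights = softmax(q @ K.T / np.sqrt(q.size))
        return weights @ V

attn = KVCacheAttention(d_model=16)
rng = np.random.default_rng(42)
for _ in range(5):                        # decode five tokens autoregressively
    out = attn.step(rng.normal(size=16))
print(out.shape)                          # (16,)
```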
Citations: 0
A digital twin that interprets and refines chemical mechanisms
IF 18.3 Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Pub Date : 2025-09-01 DOI: 10.1038/s43588-025-00859-w
An integrated platform, Digital Twin for Chemical Science (DTCS), is developed to connect first-principles theory with spectroscopic measurements through a bidirectional feedback loop. By predicting and refining chemical reaction mechanisms before, during and after experiments, DTCS enables the interpretation of spectra and supports real-time decision-making in chemical characterization.
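The bidirectional feedback loop described above can be summarized as a simple predict-compare-refine iteration. The sketch below shows that control flow with hypothetical callables standing in for the theory, measurement and refinement components; it is not the DTCS API.

```python
def digital_twin_loop(mechanism, simulate_spectrum, measure_spectrum, refine,
                      tol=1e-3, max_iter=20):
    """Predict a spectrum from the candidate mechanism, compare it with the
    measured spectrum, and refine the mechanism until the residual is small.
    All four callables are hypothetical placeholders for illustration."""
    for _ in range(max_iter):
        predicted = simulate_spectrum(mechanism)
        observed = measure_spectrum()
        residual = sum((p - o) ** 2 for p, o in zip(predicted, observed))
        if residual < tol:
            break
        mechanism = refine(mechanism, predicted, observed)
    return mechanism
```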
Citations: 0