
Latest publications in Nature Machine Intelligence

A family of large language models for materials research with insights into model adaptability in continued pretraining
IF 23.8 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-27 | DOI: 10.1038/s42256-026-01199-8
Dhruv Ahlawat, Vaibhav Mishra, Somaditya Singh, Mohd Zaki, Vaibhav Bihani, Hargun Singh Grover, Biswajit Mishra, Santiago Miret, Mausam, N. M. Anoop Krishnan
Materials discovery and development are critical for addressing global challenges in renewable energy, sustainability, and advanced technology. Large language models (LLMs) offer unprecedented opportunities to accelerate materials research, yet their effective deployment requires domain-specific adaptation. Here we present large language models for materials (LLaMat), a family of foundational models for materials science, developed through continued pretraining of LLaMA models on 30 billion tokens derived from approximately 4 million materials science publications and crystallographic data. To develop a materials copilot, the models were adapted by instruction and task fine-tuning on 175,000 materials science question-answering pairs. Through evaluation across 42 tasks covering the entire spectrum of materials research, spanning natural language processing, structured information extraction and crystal generation, we demonstrate that LLaMat consistently outperforms state-of-the-art commercial LLMs (Claude, GPT and Gemini) while maintaining general linguistic capabilities. Beyond demonstrating the effectiveness of domain adaptation for practically deployable materials research copilots, our findings also reveal fundamental insights about LLM adaptation that may influence the development of specialized scientific artificial intelligence systems across domains. For instance, we identify increasing rigidity to domain adaptation in extensively pretrained LLMs such as LLaMA-3, a pattern consistent across our experiments that suggests a previously unidentified ‘adaptation rigidity’: overtrained LLMs become progressively harder to adapt to new domains.
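The two-stage recipe here, general pretraining followed by continued pretraining on domain text, can be illustrated with a deliberately tiny stand-in for an LLM. The sketch below is hypothetical and not the LLaMat pipeline: a character-bigram language model is first fit on a "general" corpus, then adapted by continued training on a small "domain" corpus, and perplexity on held-out domain text falls after adaptation.

```python
# Toy illustration of continued pretraining (hypothetical; not the LLaMat pipeline).
import math
from collections import defaultdict

class BigramLM:
    def __init__(self):
        # counts[a][b] = number of times character b followed character a
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, text):
        for a, b in zip(text, text[1:]):
            self.counts[a][b] += 1

    def perplexity(self, text):
        log_prob, n = 0.0, 0
        vocab = 128  # add-one smoothing over a small character vocabulary
        for a, b in zip(text, text[1:]):
            total = sum(self.counts[a].values())
            p = (self.counts[a][b] + 1) / (total + vocab)
            log_prob += math.log(p)
            n += 1
        return math.exp(-log_prob / n)

general = "the cat sat on the mat and the dog ran in the park " * 20
domain = "the perovskite lattice hosts oxygen vacancies in the crystal " * 20
held_out = "oxygen vacancies in the perovskite lattice"

lm = BigramLM()
lm.train(general)                 # stage 1: general pretraining
ppl_before = lm.perplexity(held_out)
lm.train(domain)                  # stage 2: continued pretraining on domain text
ppl_after = lm.perplexity(held_out)
print(ppl_before > ppl_after)     # adaptation lowers domain perplexity
```

The same logic, at vastly larger scale and with transformer weights instead of bigram counts, is what continued pretraining buys: the base model's distribution shifts towards the domain corpus without retraining from scratch.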
Citations: 0
Conditional diffusion with locality-aware modal alignment for generating diverse protein conformational ensembles
IF 23.8 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-25 | DOI: 10.1038/s42256-026-01198-9
Baoli Wang, Chenglin Wang, Jingyang Chen, Danlin Liu, Changzhi Sun, Jie Zhang, Kai Zhang, Honglin Li
Citations: 0
AI and the long game
IF 23.9 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-24 | DOI: 10.1038/s42256-026-01203-1
Almost 10 years ago, AlphaGo defeated one of the world’s best professional players in the complex, ancient game of Go. It was a pivotal moment that spawned new research directions and marked the beginning of a busy decade in AI development.
Citations: 0
Author Correction: Mask-prior-guided denoising diffusion improves inverse protein folding
IF 23.8 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-24 | DOI: 10.1038/s42256-026-01209-9
Peizhen Bai, Filip Miljković, Xianyuan Liu, Leonardo De Maria, Rebecca Croasdale-Wood, Owen Rackham, Haiping Lu
Citations: 0
Cardiac health assessment across scenarios and devices using a multimodal foundation model pretrained on data from 1.7 million individuals
IF 23.9 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-24 | DOI: 10.1038/s42256-026-01180-5
Xiao Gu, Wei Tang, Jinpei Han, Veer Sangha, Fenglin Liu, Shreyank N. Gowda, Antonio H. Ribeiro, Patrick Schwab, Kim Branson, Lei Clifton, Antonio Luiz P. Ribeiro, Zhangdaihong Liu, David A. Clifton
Cardiovascular diseases remain a major contributor to the global burden of healthcare, highlighting the importance of accurate and scalable methods for cardiac monitoring. Cardiac biosignals, most notably electrocardiograms (ECG) and photoplethysmograms, are essential for diagnosing, preventing and managing cardiovascular conditions across clinical and home settings. However, their acquisition varies substantially across scenarios and devices, whereas existing analytical models often rely on homogeneous datasets and static bespoke models, limiting their robustness and generalizability in diverse real-world contexts. Here we present a cardiac sensing foundation model (CSFM) that leverages transformer architectures and a generative masked pretraining strategy to learn unified representations from heterogeneous health records. CSFM is pretrained on a multimodal integration of data from various large-scale datasets, comprising cardiac signals from approximately 1.7 million individuals and their corresponding clinical or machine-generated text reports. The embeddings derived from CSFM act as effective, transferable features across diverse cardiac sensing scenarios, supporting a seamless adaptation to the varied input configurations and sensor modalities. Extensive evaluations across diagnostic tasks, demographic recognition, vital sign measurement, clinical outcome prediction and ECG question answering demonstrate that CSFM consistently outperforms traditional one-modal-one-task approaches. Notably, CSFM maintains favourable performance across both 12-lead and single-lead ECGs, as well as in scenarios involving ECG only, photoplethysmogram only or a combination of both. This highlights its potential as a versatile and scalable foundation for comprehensive cardiac monitoring. Gu et al. introduce a cardiac foundation model that learns from millions of heart signals and textual interpretations, enabling it to handle heart data collected either in hospitals or at home. It offers clear and reliable insights across different devices and settings.
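The core of a generative masked-pretraining objective is that the model is scored only on the samples it never saw. A minimal sketch on a synthetic 1-D biosignal, assuming a stand-in "model" (linear interpolation from unmasked neighbours) rather than the paper's transformer:

```python
# Minimal sketch of a masked-reconstruction objective on a 1-D signal
# (illustrative only; CSFM itself is a transformer trained at scale).
import numpy as np

def masked_reconstruction_loss(signal, reconstruction, mask):
    """MSE computed over the masked positions only."""
    diff = (signal - reconstruction) ** 2
    return float(diff[mask].mean())

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 500)
ecg_like = np.sin(2 * np.pi * 5 * t)       # stand-in for an ECG trace

mask = rng.random(t.size) < 0.3            # mask roughly 30% of samples
corrupted = ecg_like.copy()
corrupted[mask] = 0.0                      # the model only sees the unmasked part

# A trivial "model": linearly interpolate from the unmasked samples.
recon = np.interp(t, t[~mask], ecg_like[~mask])

loss = masked_reconstruction_loss(ecg_like, recon, mask)
print(round(loss, 6))
```

Because the loss is restricted to masked positions, the model cannot reduce it by copying visible input, which is what forces it to learn signal structure during pretraining.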
Citations: 0
Synthetic X-ray-driven tracking and control of miniature medical devices
IF 23.9 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-23 | DOI: 10.1038/s42256-026-01190-3
Chunxiang Wang, Wenbin Kang, Mengmeng Sun, Hongchuan Zhang, Chong Hong, Sinan Ozgun Demir, Halim Ugurlu, Kun Hao, Zemin Liu, Tianlu Wang, Metin Sitti
The clinical translation of miniature medical devices (MMDs) for minimally invasive surgery promises transformative advances in biomedical engineering, offering enhanced precision, reduced patient trauma and faster recovery times. However, their effective deployment in complex anatomies under real-time X-ray guidance—a widely used surgical imaging modality—presents challenges such as low imaging quality and the difficulty of spatial MMD control. Manual identification and operation are labour intensive and error prone. Meanwhile, deep learning-based automation is limited by the scarcity of annotated X-ray datasets of MMDs owing to costly data collection, laborious annotation and privacy constraints. Here we introduce MicroSyn-X, a framework for training computer vision models to enable robotic teleoperation of MMDs using synthesized high-fidelity, pixel-accurate, auto-labelled and domain-randomized X-ray images, eliminating manual data curation. Integrating MicroSyn-X into a teleoperated robotic system enables real-time localization and navigation of magnetic soft and magnetic liquid MMDs within both ex vivo and dynamic in vivo environments, demonstrating robustness under challenging imaging conditions of low contrast, high noise and occlusion. Alongside these contributions, we open source the X-ray MMD dataset to enable benchmarking. Addressing data scarcity and enabling real-time robotic navigation, this work advances MMD-assisted minimally invasive surgery towards next-generation precision interventions. Wang et al. introduce MicroSyn-X, a synthetic X-ray data generation framework that overcomes data scarcity in miniature medical devices, enabling robust deep learning-based tracking and real-time robotic navigation in challenging surgical settings.
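The key property of a synthetic data pipeline like this is that every frame comes with a pixel-accurate label for free, while background, contrast and noise are randomized so a downstream model does not overfit to any one imaging condition. A hedged sketch, with all names and parameter ranges illustrative rather than taken from MicroSyn-X:

```python
# Hypothetical domain-randomized synthetic "X-ray" generator with auto-labels
# (illustrative of the idea only; not the paper's implementation).
import numpy as np

def synth_xray(size=64, rng=None):
    """Return a noisy synthetic frame and its ground-truth device mask."""
    rng = rng if rng is not None else np.random.default_rng()
    img = rng.uniform(0.3, 0.7) * np.ones((size, size))      # randomized background level
    # randomized device: a bright disc at a random position and radius
    cy, cx = rng.integers(8, size - 8, size=2)
    r = int(rng.integers(3, 6))
    yy, xx = np.mgrid[:size, :size]
    mask = (yy - cy) ** 2 + (xx - cx) ** 2 <= r ** 2          # pixel-accurate label
    img[mask] += rng.uniform(0.2, 0.5)                        # randomized contrast
    img += rng.normal(0.0, rng.uniform(0.01, 0.1), img.shape) # randomized noise
    return img.clip(0.0, 1.0), mask

frame, label = synth_xray(rng=np.random.default_rng(42))
print(frame.shape, bool(label.any()))
```

Sampling thousands of such (frame, mask) pairs gives a fully annotated training set with zero manual labelling, which is the property that makes this route attractive when real annotated X-ray data are scarce.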
Citations: 0
A large-scale randomized study of large language model feedback in peer review
IF 23.8 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-23 | DOI: 10.1038/s42256-026-01188-x
Nitya Thakkar, Mert Yuksekgonul, Jake Silberg, Animesh Garg, Nanyun Peng, Fei Sha, Rose Yu, Carl Vondrick, James Zou
Citations: 0
Preconditioned inexact stochastic ADMM for deep models
IF 23.9 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-20 | DOI: 10.1038/s42256-026-01182-3
Shenglong Zhou, Ouya Wang, Ziyan Luo, Yongxu Zhu, Geoffrey Ye Li
Deep learning models are usually trained with stochastic gradient descent-based algorithms, but these optimizers face inherent limitations, such as slow convergence and stringent assumptions for convergence. In particular, data heterogeneity arising from distributed settings poses significant challenges to their theoretical and numerical performance. Here we develop an algorithm called PISA (preconditioned inexact stochastic alternating direction method of multipliers). Grounded in rigorous theoretical guarantees, the algorithm converges under the sole assumption of Lipschitz continuity of the gradient on a bounded region, thereby removing the need for other conditions commonly imposed by stochastic methods. This capability enables the proposed algorithm to tackle the challenge of data heterogeneity effectively. Moreover, the algorithmic architecture enables scalable parallel computing and supports various preconditions, such as second-order information, second moment and orthogonalized momentum by Newton–Schulz iterations. Incorporating the last two preconditions in PISA yields two computationally efficient variants: SISA and NSISA. Comprehensive experimental evaluations for training or fine-tuning diverse deep models, including vision models, large language models, reinforcement learning models, generative adversarial networks and recurrent neural networks, demonstrate superior numerical performance of SISA and NSISA compared with various state-of-the-art optimizers. Zhou et al. develop PISA, an optimizer for deep learning models that supports heterogeneous data and various preconditions. It converges under minimal assumptions, while outperforming established methods for diverse tasks.
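For readers unfamiliar with the ADMM family that PISA extends, the textbook deterministic version on a small convex problem shows the characteristic three-step update (primal x-step, prox z-step, dual ascent). This is a background sketch of plain ADMM for the lasso, not PISA itself and not its preconditioned stochastic machinery:

```python
# Textbook ADMM for the lasso: min (1/2)||Ax - b||^2 + lam*||z||_1  s.t.  x = z.
# Background only; PISA's preconditioning and stochastic inexact steps are not shown.
import numpy as np

def soft_threshold(v, k):
    return np.sign(v) * np.maximum(np.abs(v) - k, 0.0)

def lasso_admm(A, b, lam=0.1, rho=1.0, iters=200):
    n = A.shape[1]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    AtA, Atb = A.T @ A, A.T @ b
    inv = np.linalg.inv(AtA + rho * np.eye(n))   # cached x-update system
    for _ in range(iters):
        x = inv @ (Atb + rho * (z - u))          # x-minimization step
        z = soft_threshold(x + u, lam / rho)     # z-minimization (prox of l1)
        u = u + x - z                            # dual update on the constraint x = z
    return z

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 10))
x_true = np.zeros(10)
x_true[:3] = [2.0, -1.5, 1.0]
b = A @ x_true + 0.01 * rng.normal(size=50)

x_hat = lasso_admm(A, b)
print(np.abs(x_hat - x_true).max() < 0.2)        # recovers the sparse signal
```

PISA replaces the exact x-minimization with a preconditioned inexact stochastic step, which is what makes the scheme practical for deep networks where the subproblem cannot be solved in closed form.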
Citations: 0
Meta-designing quantum experiments with language models
IF 23.9 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-02-19 | DOI: 10.1038/s42256-025-01153-0
Sören Arlt, Haonan Duan, Felix Li, Sang Michael Xie, Yuhuai Wu, Mario Krenn
Artificial intelligence can solve complex scientific problems beyond human capabilities, but the resulting solutions offer little insight into the underlying physical principles. One prominent example is quantum physics, where computers can discover experiments for the generation of specific quantum states, but it is unclear how finding general design concepts can be automated. Here we address this challenge by training a transformer-based language model to create human-readable Python code that generates entire families of experiments. The model is trained on millions of synthetic examples of quantum states and their corresponding experimental blueprints, enabling it to infer general construction rules rather than isolated solutions. This strategy, which we call meta-design, enables scientists to gain a deeper understanding and to extrapolate to larger experiments without additional optimization. We demonstrate that the approach can rediscover known design principles and uncover previously unknown generalizations of important quantum states, such as those from condensed-matter physics. Beyond quantum optics, the methodology provides a blueprint for applying language models to interpretable, generalizable scientific discovery across disciplines such as materials science and engineering. Language models can write human-readable code that captures general design rules, generating whole families of quantum experiments at once. A design strategy described here makes results interpretable and scalable, as well as accelerates discovery.
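The distinction between finding one solution and meta-design can be made concrete: instead of a single experimental blueprint, the model emits a program whose argument indexes an entire family. A toy rendition, with the generated code and the GHZ-style setup entirely hypothetical:

```python
# Toy illustration of "meta-design": the (simulated) model output is a Python
# program that constructs a whole family of experiments, not one fixed setup.
generated_code = """
def ghz_setup(n):
    # hypothetical blueprint: n photon sources plus pairwise entangling operations
    sources = [f'photon_{i}' for i in range(n)]
    ops = [('entangle', f'photon_{i}', f'photon_{i+1}') for i in range(n - 1)]
    return {'sources': sources, 'operations': ops}
"""

namespace = {}
exec(generated_code, namespace)        # run the model-written program
ghz_setup = namespace["ghz_setup"]

# The same construction rule extrapolates to any size without re-optimization:
for n in (3, 5, 8):
    setup = ghz_setup(n)
    print(n, len(setup["sources"]), len(setup["operations"]))
```

Because the artefact is readable code rather than an opaque optimized circuit, a human can inspect the construction rule itself, which is the interpretability payoff the abstract emphasizes.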
引用次数: 0
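To make the "meta-design" idea concrete: instead of outputting one fixed experimental layout, the model emits a program whose parameter sweeps out a whole family of experiments. The following is a hedged toy sketch of that output format, not the authors' actual model, dataset, or blueprint encoding — the `ghz_blueprint` function and its operation tuples are entirely hypothetical:

```python
# Illustrative sketch only: a "meta-designed" blueprint is a program that maps
# a size parameter to a full experimental layout, rather than a single fixed
# solution. The operation names and encoding here are hypothetical.

def ghz_blueprint(n: int) -> list[tuple]:
    """Return a toy blueprint for an n-party GHZ-like experiment:
    one entangling source followed by a chain of mode couplers."""
    if n < 2:
        raise ValueError("need at least two parties")
    ops = [("source", 0, 1)]  # entangle the first two modes
    for k in range(2, n):
        ops.append(("coupler", k - 1, k))  # extend entanglement to mode k
    ops.extend(("detector", k) for k in range(n))  # measure every mode
    return ops

# The same code extrapolates to larger experiments with no re-optimization:
small = ghz_blueprint(3)   # 1 source + 1 coupler + 3 detectors
large = ghz_blueprint(8)   # 1 source + 6 couplers + 8 detectors
```

The point of such an output, as the abstract emphasizes, is interpretability: a human can read the loop and recover the general construction rule, rather than inspecting one opaque optimized solution.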
Parallel hierarchical encoding of linguistic representations in the human auditory cortex and recurrent automatic speech recognition systems
IF 23.9 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-02-17 · DOI: 10.1038/s42256-026-01185-0
Menoua Keshishian, Gavin Mischler, Samuel Thomas, Brian Kingsbury, Stephan Bickel, Ashesh D. Mehta, Nima Mesgarani
Transforming continuous acoustic speech signals into discrete linguistic meaning is a remarkable computational feat accomplished by both the human brain and modern artificial intelligence. A key scientific question is whether these biological and artificial systems, despite their different architectures, converge on similar strategies to solve this challenge. Although automatic speech recognition systems now achieve human-level performance, research on their parallels with the brain has been limited by biologically implausible, non-causal models and comparisons that stop at predicting brain activity without detailing the alignment of the underlying representations. Furthermore, studies using text-based models overlook the crucial acoustic stages of speech processing. Here we bridge these gaps by uncovering a striking correspondence between the brain’s processing hierarchy and the model’s internal representations using high-resolution intracranial recordings and a causal, recurrent automatic speech recognition model. Specifically, we demonstrate a deep alignment in their algorithmic approach: neural activity in distinct cortical regions maps topographically to corresponding model layers, and critically, the representational content at each stage follows a parallel progression from acoustic to phonetic, lexical and semantic information. This work thus moves beyond demonstrating simple model–brain alignment to specifying the shared underlying representations at each stage of processing, providing direct evidence that both systems converge on a similar computational strategy for transforming sound into meaning. Keshishian, Mischler et al. report that a recurrent automatic speech recognition system aligns closely with brain organization: model layers map to distinct cortical regions and naturally learn to encode a parallel progression from acoustic to phonetic, lexical and semantic content.
{"title":"Parallel hierarchical encoding of linguistic representations in the human auditory cortex and recurrent automatic speech recognition systems","authors":"Menoua Keshishian, Gavin Mischler, Samuel Thomas, Brian Kingsbury, Stephan Bickel, Ashesh D. Mehta, Nima Mesgarani","doi":"10.1038/s42256-026-01185-0","DOIUrl":"10.1038/s42256-026-01185-0","url":null,"abstract":"Transforming continuous acoustic speech signals into discrete linguistic meaning is a remarkable computational feat accomplished by both the human brain and modern artificial intelligence. A key scientific question is whether these biological and artificial systems, despite their different architectures, converge on similar strategies to solve this challenge. Although automatic speech recognition systems now achieve human-level performance, research on their parallels with the brain has been limited by biologically implausible, non-causal models and comparisons that stop at predicting brain activity without detailing the alignment of the underlying representations. Furthermore, studies using text-based models overlook the crucial acoustic stages of speech processing. Here we bridge these gaps by uncovering a striking correspondence between the brain’s processing hierarchy and the model’s internal representations using high-resolution intracranial recordings and a causal, recurrent automatic speech recognition model. Specifically, we demonstrate a deep alignment in their algorithmic approach: neural activity in distinct cortical regions maps topographically to corresponding model layers, and critically, the representational content at each stage follows a parallel progression from acoustic to phonetic, lexical and semantic information. 
This work thus moves beyond demonstrating simple model–brain alignment to specifying the shared underlying representations at each stage of processing, providing direct evidence that both systems converge on a similar computational strategy for transforming sound into meaning. Keshishian, Mischler et al. report that a recurrent automatic speech recognition system aligns closely with brain organization: model layers map to distinct cortical regions and naturally learn to encode a parallel progression from acoustic to phonetic, lexical and semantic content.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"8 2","pages":"257-269"},"PeriodicalIF":23.9,"publicationDate":"2026-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146205128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
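A standard way to quantify the kind of layer-to-region correspondence this abstract describes is to fit a linear encoding model from each model layer's activations to each recording site and assign the site to its best-predicting layer. The sketch below illustrates only that analysis pattern on synthetic data; it is not the study's intracranial pipeline, and all arrays and names here are stand-ins:

```python
# Sketch of a layer-to-electrode encoding analysis on synthetic data.
# The data are random stand-ins; only the analysis pattern is illustrated.
import numpy as np

rng = np.random.default_rng(0)
T, n_layers, d = 500, 4, 16                       # time points, layers, features
layers = rng.standard_normal((n_layers, T, d))    # fake layer activations

# Two fake "electrodes": one driven by an early layer, one by a late layer.
w_early, w_late = rng.standard_normal(d), rng.standard_normal(d)
electrodes = np.stack([layers[0] @ w_early, layers[3] @ w_late], axis=1)  # (T, 2)

def ridge_r2(X, y, lam=1.0):
    """R^2 of a ridge regression from features X (T, d) to response y (T,)."""
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
    resid = y - X @w
    return 1.0 - resid.var() / y.var()

# Score every (electrode, layer) pair, then assign each electrode to the
# layer that predicts it best -- the topographic map described in the paper.
scores = np.array([[ridge_r2(layers[l], electrodes[:, e])
                    for l in range(n_layers)]
                   for e in range(2)])
best_layer = scores.argmax(axis=1)
```

With this construction the early-driven electrode maps to layer 0 and the late-driven one to layer 3, mimicking the acoustic-to-semantic progression across cortical regions that the paper reports.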