
Latest articles from Nature Machine Intelligence

A flaw in using pretrained protein language models in protein–protein interaction inference models
IF 23.9 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-02-13 · DOI: 10.1038/s42256-025-01176-7
Joseph Szymborski, Amin Emad
With the growing pervasiveness of pretrained protein language models (pLMs), pLM-based methods are increasingly being put forward for the protein–protein interaction (PPI) inference task. Here we identify and confirm that existing pretrained pLMs are a source of data leakage for the downstream PPI task. We characterize the extent of the data leakage problem by training and comparing small and efficient pLMs on a dataset that controls for data leakage (strict) with one that does not (non-strict). Although data leakage from pretrained pLMs causes a measurable inflation of testing scores, we find that this does not necessarily extend to other, non-paired biological tasks such as protein keyword annotation. Further, we find no connection between the context lengths of pLMs and the performance of pLM-based PPI inference methods on proteins whose sequence lengths exceed those context lengths. Furthermore, we show that pLM-based and non-pLM-based models fail to generalize in tasks such as prediction of human–SARS-CoV-2 PPIs or the effect of point mutations on binding affinities. This study demonstrates the importance of extending existing protocols for the evaluation of pLM-based models applied to paired biological datasets and identifies areas of weakness in current pLMs. The usage of pretrained protein language models (pLMs) is rapidly growing. However, Szymborski and Emad find that pretrained pLMs can be a source of data leakage in the task of protein–protein interaction inference, showing inflated performance scores.
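The strict versus non-strict distinction can be illustrated with a small sketch (the protein IDs and the `split_pairs` helper are invented for illustration; the paper's actual dataset construction is more involved):

```python
def split_pairs(pairs, pretrain_ids):
    """Partition candidate PPI pairs: a pair qualifies for the 'strict'
    test set only if neither protein appeared in the pLM pretraining
    corpus, so pretrained embeddings cannot leak test-set information."""
    train, strict_test = [], []
    for a, b in pairs:
        if a in pretrain_ids or b in pretrain_ids:
            train.append((a, b))        # overlaps pretraining: unusable for strict testing
        else:
            strict_test.append((a, b))  # fully unseen pair: leakage-controlled
    return train, strict_test

# Toy example with hypothetical protein IDs.
pairs = [("P1", "P2"), ("P3", "P4"), ("P1", "P5")]
train, strict_test = split_pairs(pairs, pretrain_ids={"P1", "P2"})
```

Under a non-strict protocol this filter is skipped, so test pairs may involve proteins the pLM has already memorized during pretraining.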
Nature Machine Intelligence, volume 8, issue 2, pages 197–208.
Citations: 0
Reusability Report: Evaluating the performance of a meta-learning foundation model on predicting the antibacterial activity of natural products
IF 23.9 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-02-12 · DOI: 10.1038/s42256-026-01187-y
Caitlin M. Butt, Allison S. Walker
Deep learning foundation models are becoming increasingly popular for use in bioactivity prediction. Recently, Feng et al. developed ActFound, a bioactivity foundation model that jointly uses pairwise learning and meta-learning. These techniques allow the model to be fine-tuned to a more specific bioactivity task with only a small amount of new data. Here, to investigate the generalizability of the model, we fine-tuned the foundation model on an antibacterial natural products (NPs) dataset. Large, labelled NPs datasets, which are needed to train traditional deep learning methods, are scarce. Therefore, the bioactivity prediction of NPs is an ideal task for foundation models. We studied the performance of ActFound on the NPs dataset using a range of few-shot settings. Additionally, we compared ActFound’s performance with those of other state-of-the-art models in the field. We found that ActFound was unable to reach the same level of accuracy on the antibacterial NPs dataset as it did on other cross-domain tasks reported in the original publication. However, ActFound performed comparably to or better than the other models studied, especially at the low-shot settings. Our results establish ActFound as a useful foundation model for bioactivity prediction in tasks with limited data, particularly for datasets that contain the bioactivities of similar compounds. This Reusability Report tests the ability of a foundation model, ActFound, to predict the antibacterial activity of plant natural products. We found that although all models performed poorly on this task, ActFound performed better than similar models.
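A few-shot evaluation of this kind hinges on how the k labelled examples are sampled. A minimal sketch of one such split, assuming placeholder compound IDs and an invented `few_shot_split` helper (the report's actual protocol is not reproduced here):

```python
import random

def few_shot_split(dataset, shots, seed=0):
    """Sample `shots` labelled compounds as the fine-tuning (support) set;
    everything else becomes the held-out (query) set."""
    rng = random.Random(seed)            # fixed seed for reproducible splits
    support = rng.sample(dataset, shots)
    chosen = set(support)
    query = [x for x in dataset if x not in chosen]  # preserve original order
    return support, query

compounds = [f"NP{i}" for i in range(10)]  # placeholder natural-product IDs
support, query = few_shot_split(compounds, shots=3)
```

Repeating this over several seeds and shot counts (e.g. 1-, 5-, 10-shot) gives the range of low-shot settings the report compares models under.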
Nature Machine Intelligence, volume 8, issue 2, pages 270–275. Open-access PDF: https://www.nature.com/articles/s42256-026-01187-y.pdf
Citations: 0
What matters in building vision–language–action models for generalist robots
IF 23.9 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-02-11 · DOI: 10.1038/s42256-025-01168-7
Xinghang Li, Peiyan Li, Long Qian, Minghuan Liu, Dong Wang, Jirong Liu, Bingyi Kang, Xiao Ma, Xinlong Wang, Di Guo, Tao Kong, Hanbo Zhang, Huaping Liu
To utilize foundation vision–language models (VLMs) for robotic tasks and motion planning, the community has proposed different methods for injecting action components into VLMs and building vision–language–action models (VLAs). Here we disclose the key factors that significantly influence the performance of VLAs on robot manipulation problems and focus on answering three essential design choices: which backbone to select, how to formulate the VLA architecture and when to add cross-embodiment data. The results firmly justify our preference for VLAs and lead us to develop a new family of VLAs, RoboVLMs, which require very few manual designs and achieve new state-of-the-art performance in three simulation tasks and real-world experiments. Through our extensive experiments, which cover 8 VLM backbones, 4 policy architectures and over 600 distinctly designed experiments, we provide a detailed guidebook for the future design of VLAs. In addition to the study, the highly flexible RoboVLMs framework, which supports easy integration of new VLMs and free combinations of various design choices, is made public to facilitate future research. We open-source all details, including codes, models, datasets and toolkits, along with detailed training and evaluation recipes at robovlms.github.io. Vision–language–action models recently emerged as a tool for robotics. Here Li and colleagues compare vision–language–action models and highlight what makes a model useful.
Nature Machine Intelligence, volume 8, issue 2, pages 158–172.
Citations: 0
When large language models are reliable for judging empathic communication
IF 23.9 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-02-11 · DOI: 10.1038/s42256-025-01169-6
Aakriti Kumar, Nalin Poungpeth, Diyi Yang, Erina Farrell, Bruce L. Lambert, Matthew Groh
Large language models (LLMs) excel at generating empathic responses in text-based conversations. But how reliably do they judge the nuances of empathic communication? Here we investigate this question by comparing how experts, crowdworkers and LLMs annotate empathic communication across four evaluative frameworks drawn from psychology, natural language processing and communications, applied to 200 real-world conversations where one speaker shares a personal problem and the other offers support. Drawing on 3,150 expert annotations, 2,844 crowd annotations and 3,150 LLM annotations, we assess interrater reliability between these three annotator groups. We find that expert agreement is high but varies across the frameworks’ subcomponents depending on their clarity, complexity and subjectivity. We show that expert agreement offers a more informative benchmark for contextualizing LLM performance than standard classification metrics. Across all four frameworks, LLMs consistently approach this expert-level benchmark and exceed the reliability of crowdworkers. These results demonstrate how LLMs, when validated on specific tasks with appropriate benchmarks, can support transparency and oversight in emotionally sensitive applications, including their use as conversational companions. Kumar et al. show that large language models (LLMs) nearly match expert reliability and outperform laypeople when assessing empathic communication across multiple frameworks. The performance of both LLMs and experts depends on clear and specific evaluation criteria.
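Interrater reliability between two annotator groups is commonly quantified with a chance-corrected agreement statistic; the abstract does not name the exact measure used, so as one illustrative choice, a minimal Cohen's kappa for two raters over categorical labels:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement between two raters, corrected
    for the agreement expected by chance from their label frequencies."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters independently pick label c.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)
```

Kappa is 1 for perfect agreement and 0 when agreement is no better than chance; comparing LLM-vs-expert kappa against expert-vs-expert kappa is one way to contextualize LLM reliability, as the study advocates.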
Nature Machine Intelligence, volume 8, issue 2, pages 173–185. Open-access PDF: https://www.nature.com/articles/s42256-025-01169-6.pdf
Citations: 0
A federated graph learning method to realize multi-party collaboration for molecular discovery
IF 23.9 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-02-10 · DOI: 10.1038/s42256-026-01184-1
Liang Zhang, Juan Zhang, Rui Huang, Yiwen Wang, Linjing Liu, Yanyong Zhang, Kong Chen, Jun Jiang, Yuen Wu
Optimizing molecular resource utilization for molecular discovery requires collaborative efforts across research institutions and organizations to accelerate progress. However, given the high research value of both successful and unsuccessful molecules produced by each institution (or organization), these findings are typically kept highly private and confidential until formal publication or commercialization, with even failed molecules rarely disclosed. This confidentiality requirement presents a great challenge for most existing methods when collaboratively handling molecular data with heterogeneous distributions under stringent privacy constraints. Here we propose FedLG (federated learning Lanczos graph), a federated graph learning method that leverages the Lanczos algorithm to facilitate collaborative model training across multiple parties, achieving reliable prediction performance under strict privacy protection conditions. Compared with various existing federated learning methods, FedLG exhibits excellent model performance on 18 benchmark datasets in a simulated federated learning environment. Under different privacy-preserving mechanism settings, FedLG demonstrates robust performance and resistance to noise. Leave-one-client-out experiments and comparison tests across each simulated institution show that FedLG achieves improved heterogeneous data aggregation capabilities and more promising outcomes than localized training. In addition, we incorporate Bayesian optimization into FedLG to show its scalability and further stabilize model performance. Overall, FedLG can be considered an effective method to realize multi-party collaboration while ensuring that sensitive molecular information is protected from potential leakage. Zhang et al. introduce FedLG, a federated graph learning framework that leverages Lanczos-based projection to effectively aggregate heterogeneous molecular data.
Extensive benchmarks demonstrate its robustness across diverse molecular discovery tasks.
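For orientation, the generic skeleton that federated methods like this build on is a server-side aggregation round; a minimal sketch of size-weighted federated averaging (FedLG's Lanczos-based projection of updates is deliberately omitted, so this is not the paper's method, only the common baseline it extends):

```python
def fedavg(client_params, client_sizes):
    """One aggregation round: dataset-size-weighted average of each
    client's parameter vector. Clients never share raw molecular data,
    only parameters."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(p[i] * s for p, s in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]
```

A leave-one-client-out experiment, as in the paper, reruns such rounds with one institution's parameters withheld and compares the resulting model's performance on that institution's data.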
Nature Machine Intelligence, volume 8, issue 2, pages 246–256.
Citations: 0
Attributing and situating knowledge cannot be left to language models
IF 23.8 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-02-06 · DOI: 10.1038/s42256-026-01193-0
Roxana Radu, Luc Rocher
Citations: 0
Authorization of prognostic AI medical devices
IF 23.9 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-02-06 · DOI: 10.1038/s42256-025-01171-y
Urs J. Muehlematter, Kerstin Noelle Vokinger
Fewer than 2% of artificial intelligence devices authorized by the US Food and Drug Administration are prognostic, with prediction horizons ranging from minutes to several years. As the number of prognostic AI devices increases, it is important to address the accompanying regulatory and ethical challenges.
Nature Machine Intelligence, volume 8, issue 2, pages 138–143.
Citations: 0
Visual language models show widespread visual deficits on neuropsychological tests
IF 23.9 · CAS Tier 1 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-02-06 · DOI: 10.1038/s42256-026-01179-y
Gene Tangtartharakul, Katherine R. Storrs
Visual language models (VLMs) show remarkable performance in visual reasoning tasks, successfully tackling college-level challenges that require a high-level understanding of images. However, some recent reports of VLMs struggling to reason about elemental visual concepts such as orientation, position, continuity and occlusion suggest a potential gulf between human and VLM vision. Currently, few assessments enable a direct comparison between human and VLM performance, which limits our ability to measure alignment between the two systems. Here we use the toolkit of neuropsychology to systematically evaluate the capabilities of three state-of-the-art VLMs across low-, mid- and high-level visual domains. Using 51 tests drawn from 6 clinical and experimental psychology batteries, we characterize the visual abilities of leading VLMs relative to normative performance in healthy adults. While the models excel in straightforward object recognition tasks, we find widespread deficits in low- and mid-level visual abilities that would be considered clinically significant in humans. These selective deficits, profiled through validated test batteries, suggest that an artificial system can achieve complex object recognition without developing the foundational visual concepts that in humans require no explicit training. Tangtartharakul and Storrs use standardized neuropsychological tests to compare human visual abilities with those of visual language models (VLMs). They report that while VLMs excel in high-level object recognition, they show deficits in low- and mid-level visual abilities.
Nature Machine Intelligence, volume 8, issue 2, pages 209–219.
引用次数: 0
Identifying spatial single-cell-level interactions with graph transformer
IF 23.9 CAS Tier 1, Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-02-06 DOI: 10.1038/s42256-026-01191-2
Xiangzheng Cheng, Suoqin Jin
Identifying cell–cell interactions from imaging-based spatial transcriptomics suffers from limited gene panels. A new self-supervised graph transformer-based method can resolve spatial single-cell-level interactions without requiring known ligand–receptor pairs.
{"title":"Identifying spatial single-cell-level interactions with graph transformer","authors":"Xiangzheng Cheng, Suoqin Jin","doi":"10.1038/s42256-026-01191-2","DOIUrl":"10.1038/s42256-026-01191-2","url":null,"abstract":"Identifying cell–cell interactions from imaging-based spatial transcriptomics suffers from limited gene panels. A new self-supervised graph transformer-based method can resolve spatial single-cell-level interactions without requiring known ligand–receptor pairs.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"8 2","pages":"146-147"},"PeriodicalIF":23.9,"publicationDate":"2026-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146135555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
On the troubling rise of generative AI suspicion in academic publishing
IF 23.9 CAS Tier 1, Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-01-30 DOI: 10.1038/s42256-026-01178-z
Raffaele Ciriello
{"title":"On the troubling rise of generative AI suspicion in academic publishing","authors":"Raffaele Ciriello","doi":"10.1038/s42256-026-01178-z","DOIUrl":"10.1038/s42256-026-01178-z","url":null,"abstract":"","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"8 2","pages":"136-137"},"PeriodicalIF":23.9,"publicationDate":"2026-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146089497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0