
Latest publications in Nature Machine Intelligence

Deep learning enhances the prediction of HLA class I-presented CD8+ T cell epitopes in foreign pathogens
IF 23.8 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-28 | DOI: 10.1038/s42256-024-00971-y
Jeremy Wohlwend, Anusha Nathan, Nitan Shalon, Charles R. Crain, Rhoda Tano-Menka, Benjamin Goldberg, Emma Richards, Gaurav D. Gaiha, Regina Barzilay

Accurate in silico determination of CD8+ T cell epitopes would greatly enhance T cell-based vaccine development, but current prediction models are not reliably successful. Here, motivated by recent successes applying machine learning to complex biology, we curated a dataset of 651,237 unique human leukocyte antigen class I (HLA-I) ligands and developed MUNIS, a deep learning model that identifies peptides presented by HLA-I alleles. MUNIS shows improved performance compared with existing models in predicting peptide presentation and CD8+ T cell epitope immunodominance hierarchies. Moreover, application of MUNIS to proteins from Epstein–Barr virus led to successful identification of both established and novel HLA-I epitopes, which were experimentally validated by in vitro HLA-I-peptide stability and T cell immunogenicity assays. MUNIS performs comparably to an experimental stability assay in terms of immunogenicity prediction, suggesting that deep learning can reduce experimental burden and accelerate identification of CD8+ T cell epitopes for rapid T cell vaccine development.
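MUNIS itself is a deep neural network trained on the curated ligand dataset; the abstract does not give its architecture. As a minimal sketch of the task's interface only — a peptide sequence in, a presentation probability out — a toy logistic scorer might look like the following. The encoding, weights, and all names here are illustrative assumptions, not the authors' implementation.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def one_hot(peptide, max_len=11):
    # Fixed-length one-hot encoding of a peptide, zero-padded to max_len
    # positions (HLA-I ligands are typically 8-11 residues long).
    vec = [0.0] * (max_len * len(AMINO_ACIDS))
    for i, aa in enumerate(peptide[:max_len]):
        vec[i * len(AMINO_ACIDS) + AMINO_ACIDS.index(aa)] = 1.0
    return vec

def presentation_score(peptide, weights, bias=0.0):
    # Logistic score in (0, 1): higher means the model considers the peptide
    # more likely to be presented by the HLA-I allele the weights encode.
    z = sum(w * x for w, x in zip(weights, one_hot(peptide))) + bias
    return 1.0 / (1.0 + math.exp(-z))
```

A real model would replace the single linear layer with learned deep representations per allele; the sketch only fixes the input/output contract.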

Citations: 0
A unified cross-attention model for predicting antigen binding specificity to both HLA and TCR molecules
IF 23.8 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-28 | DOI: 10.1038/s42256-024-00973-w
Chenpeng Yu, Xing Fang, Shiye Tian, Hui Liu

Immune checkpoint inhibitors have demonstrated promising clinical efficacy across various tumour types, yet the percentage of patients who benefit from them remains low. Binding between tumour antigens and human leukocyte antigen class I (HLA) / T cell receptor (TCR) molecules determines antigen presentation and T cell activation, thereby playing an important role in the immunotherapy response. In this paper, we propose UnifyImmun, a unified cross-attention transformer model designed to simultaneously predict the bindings of peptides to both receptors, providing a more comprehensive evaluation of antigen immunogenicity. We devise a two-phase strategy using virtual adversarial training that enables these two tasks to reinforce each other, by compelling the encoders to extract more expressive features. Our method demonstrates superior performance in predicting both peptide-HLA and peptide-TCR binding on multiple independent and external test sets. Notably, on a large-scale COVID-19 peptide-TCR binding test set without any seen peptide in the training set, our method outperforms the current state-of-the-art methods by more than 10%. The predicted binding scores significantly correlate with the immunotherapy response and clinical outcomes on two clinical cohorts. Furthermore, the cross-attention scores and integrated gradients reveal the amino acid sites critical for peptide binding to receptors. In essence, our approach marks an essential step towards comprehensive evaluation of antigen immunogenicity.
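The core operation the abstract names — cross-attention between peptide tokens and receptor tokens — can be sketched in a few lines. This is the standard scaled dot-product form, not UnifyImmun's actual architecture; the shapes and variable names are assumptions for illustration.

```python
import numpy as np

def cross_attention(queries, keys, values):
    # Scaled dot-product cross-attention: peptide token embeddings (queries)
    # attend over receptor token embeddings (keys/values), so each peptide
    # position gathers a weighted summary of the receptor sequence.
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    # Numerically stable softmax over the receptor axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ values, weights
```

The returned attention weights are the quantity the paper inspects (together with integrated gradients) to locate amino acid sites critical for binding.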

Citations: 0
Machine learning solutions looking for PDE problems
IF 23.8 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-27 | DOI: 10.1038/s42256-025-00989-w
Machine learning models are promising approaches to tackle partial differential equations, which are foundational descriptions of many scientific and engineering problems. However, in speaking with several experts about progress in the area, questions are emerging over what realistic advantages machine learning models have and how their performance should be evaluated.
Citations: 0
Evolutionary optimization of model merging recipes
IF 23.8 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-27 | DOI: 10.1038/s42256-024-00975-8
Takuya Akiba, Makoto Shing, Yujin Tang, Qi Sun, David Ha

Large language models (LLMs) have become increasingly capable, but their development often requires substantial computational resources. Although model merging has emerged as a promising, cost-effective approach for creating new models by combining existing ones, it currently relies on human intuition and domain knowledge, limiting its potential. Here we propose an evolutionary approach that overcomes this limitation by automatically discovering effective combinations of diverse open-source models, harnessing their collective intelligence without requiring extensive additional training data or compute. Our approach operates in both parameter space and data flow space, allowing optimization beyond just the weights of the individual models. This approach even facilitates cross-domain merging, generating models such as a Japanese LLM with math reasoning capabilities. Surprisingly, our Japanese math LLM achieved state-of-the-art performance on a variety of established Japanese LLM benchmarks, even surpassing models with substantially more parameters, despite not being explicitly trained for such tasks. Furthermore, a culturally aware Japanese vision–language model generated through our approach demonstrates its effectiveness in describing Japanese culture-specific content, outperforming previous Japanese vision–language models. This work not only contributes new state-of-the-art models back to the open-source community but also introduces a new paradigm for automated model composition, paving the way for exploring alternative, efficient approaches to foundation model development.
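The parameter-space half of the idea — searching over merge coefficients with an evolutionary loop instead of hand-tuning them — can be shown on toy "models" represented as weight dictionaries. This is a deliberately simplified (1+1)-style hill climber under assumed fitness and mutation settings, not the authors' evolutionary algorithm, which also searches data flow space.

```python
import random

def merge(models, coeffs):
    # Parameter-space merging: a convex combination of corresponding weights.
    return {k: sum(c * m[k] for c, m in zip(coeffs, models)) for k in models[0]}

def evolve_merge(models, fitness, generations=100, pop=20, sigma=0.05, seed=0):
    # Evolve normalized merge coefficients: seed with the best of `pop`
    # random candidates, then hill-climb with Gaussian mutations.
    rng = random.Random(seed)

    def normalize(cs):
        s = sum(cs) or 1.0
        return [c / s for c in cs]

    best = max((normalize([rng.random() for _ in models]) for _ in range(pop)),
               key=lambda cs: fitness(merge(models, cs)))
    for _ in range(generations):
        child = normalize([max(0.0, c + rng.gauss(0.0, sigma)) for c in best])
        if fitness(merge(models, child)) >= fitness(merge(models, best)):
            best = child
    return best
```

In practice `fitness` would be a benchmark score of the merged LLM; here any scalar function of the merged weights works, which is what makes the search black-box.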

Citations: 0
Moving towards genome-wide data integration for patient stratification with Integrate Any Omics
IF 23.8 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-23 | DOI: 10.1038/s42256-024-00942-3
Shihao Ma, Andy G. X. Zeng, Benjamin Haibe-Kains, Anna Goldenberg, John E. Dick, Bo Wang

High-throughput omics profiling advancements have greatly enhanced cancer patient stratification. However, incomplete data in multi-omics integration present a substantial challenge, as traditional methods like sample exclusion or imputation often compromise biological diversity and dependencies. Furthermore, the critical task of accurately classifying new patients with partial omics data into existing subtypes is commonly overlooked. To address these issues, we introduce Integrate Any Omics (IntegrAO), an unsupervised framework for integrating incomplete multi-omics data and classifying new samples. IntegrAO first combines partially overlapping patient graphs from diverse omics sources and utilizes graph neural networks to produce unified patient embeddings. Our systematic evaluation across five cancer cohorts involving six omics modalities demonstrates IntegrAO’s robustness to missing data and its accuracy in classifying new samples with partial profiles. An acute myeloid leukaemia case study further validates its capability to uncover biological and clinical heterogeneities in incomplete datasets. IntegrAO’s ability to handle heterogeneous and incomplete data makes it an essential tool for precision oncology, offering a holistic approach to patient characterization.
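IntegrAO's first step — combining partially overlapping patient graphs from different omics sources — can be illustrated with plain similarity-graph averaging. The real method feeds the combined graphs into graph neural networks to learn embeddings; this sketch only shows the overlap-handling idea, and the data structure is an assumption for illustration.

```python
def fuse_patient_graphs(graphs):
    # Each graph covers only the patients profiled in one omics modality:
    # a dict mapping (patient_a, patient_b) edges to a similarity score.
    # Overlapping edges are averaged over the modalities that observed them,
    # so no patient is dropped and no missing value is imputed.
    totals, counts = {}, {}
    for g in graphs:
        for edge, sim in g.items():
            totals[edge] = totals.get(edge, 0.0) + sim
            counts[edge] = counts.get(edge, 0) + 1
    return {edge: totals[edge] / counts[edge] for edge in totals}
```

The fused graph retains edges seen in any single modality, which is the property that lets a new patient with only partial omics profiles still be placed relative to existing subtypes.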

Citations: 0
What large language models know and what people think they know
IF 23.8 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-21 | DOI: 10.1038/s42256-024-00976-7
Mark Steyvers, Heliodoro Tejeda, Aakriti Kumar, Catarina Belem, Sheer Karny, Xinyue Hu, Lukas W. Mayer, Padhraic Smyth

As artificial intelligence systems, particularly large language models (LLMs), become increasingly integrated into decision-making processes, the ability to trust their outputs is crucial. To earn human trust, LLMs must be well calibrated such that they can accurately assess and communicate the likelihood of their predictions being correct. Whereas recent work has focused on LLMs’ internal confidence, less is understood about how effectively they convey uncertainty to users. Here we explore the calibration gap, which refers to the difference between human confidence in LLM-generated answers and the models’ actual confidence, and the discrimination gap, which reflects how well humans and models can distinguish between correct and incorrect answers. Our experiments with multiple-choice and short-answer questions reveal that users tend to overestimate the accuracy of LLM responses when provided with default explanations. Moreover, longer explanations increased user confidence, even when the extra length did not improve answer accuracy. By adjusting LLM explanations to better reflect the models’ internal confidence, both the calibration gap and the discrimination gap narrowed, significantly improving user perception of LLM accuracy. These findings underscore the importance of accurate uncertainty communication and highlight the effect of explanation length in influencing user trust in artificial-intelligence-assisted decision-making environments.
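The two quantities the study introduces can be made concrete with simple estimators. These are illustrative proxies matching the abstract's informal definitions — mean mismatch for the calibration gap, and confidence separation between correct and incorrect answers for the discrimination gap — and may differ from the paper's exact statistics.

```python
def calibration_gap(human_confidence, model_confidence):
    # Mean absolute difference between human confidence in LLM-generated
    # answers and the model's own confidence in those answers.
    pairs = list(zip(human_confidence, model_confidence))
    return sum(abs(h - m) for h, m in pairs) / len(pairs)

def discrimination_gap(confidences, correct):
    # Mean confidence on correct answers minus mean confidence on incorrect
    # ones: a simple proxy for how well confidence separates the two classes.
    right = [c for c, ok in zip(confidences, correct) if ok]
    wrong = [c for c, ok in zip(confidences, correct) if not ok]
    return sum(right) / len(right) - sum(wrong) / len(wrong)
```

Under these definitions, the paper's intervention — rewriting explanations to reflect internal confidence — narrows `calibration_gap` by pulling the human series towards the model series.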

Citations: 0
A unified evolution-driven deep learning framework for virus variation driver prediction
IF 23.8 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-17 | DOI: 10.1038/s42256-024-00966-9
Zhiwei Nie, Xudong Liu, Jie Chen, Zhennan Wang, Yutian Liu, Haorui Si, Tianyi Dong, Fan Xu, Guoli Song, Yu Wang, Peng Zhou, Wen Gao, Yonghong Tian

The increasing frequency of emerging viral infections necessitates a rapid human response, highlighting the cost-effectiveness of computational methods. However, existing computational approaches are limited by their input forms or incomplete functionalities, preventing a unified prediction of diverse virus variation drivers and hindering in-depth applications. To address this issue, we propose a unified evolution-driven framework for predicting virus variation drivers, named Evolution-driven Virus Variation Driver prediction (E2VD), which is guided by virus evolutionary traits. With evolution-inspired design, E2VD comprehensively and significantly outperforms state-of-the-art methods across various virus mutational driver prediction tasks. Moreover, E2VD effectively captures the fundamental patterns of virus evolution. It not only distinguishes different types of mutations but also accurately identifies rare beneficial mutations that are critical for viruses to survive, while maintaining generalization capabilities across different lineages of SARS-CoV-2 and different types of viruses. Importantly, with predicted biological drivers, E2VD perceives virus evolutionary trends in which potential high-risk mutation sites are accurately recommended. Overall, E2VD represents a unified, structure-free and interpretable approach for analysing and predicting viral evolutionary fitness, providing an ideal alternative to costly wet-lab measurements to accelerate responses to emerging viral infections.

Citations: 0
A quantitative analysis of knowledge-learning preferences in large language models in molecular science
IF 23.8 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-17 | DOI: 10.1038/s42256-024-00977-6
Pengfei Liu, Jun Tao, Zhixiang Ren

Deep learning has significantly advanced molecular modelling and design, enabling an efficient understanding and discovery of novel molecules. In particular, large language models introduce a fresh research paradigm to tackle scientific problems from a natural language processing perspective. Large language models significantly enhance our understanding and generation of molecules, often surpassing existing methods with their capabilities to decode and synthesize complex molecular patterns. However, two key issues remain: how to quantify the match between model and data modalities and how to identify the knowledge-learning preferences of models. To address these challenges, we propose a multimodal benchmark, named ChEBI-20-MM, and perform 1,263 experiments to assess the model’s compatibility with data modalities and knowledge acquisition. Through the modal transition probability matrix, we provide insights into the most suitable modalities for tasks. Furthermore, we introduce a statistically interpretable approach to discover context-specific knowledge mapping by localized feature filtering. Our analysis offers an exploration of the learning mechanism and paves the way for advancing large language models in molecular science.
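The modal transition probability matrix the abstract mentions can be read as a row-normalized table of how often translating from one molecular modality (say, SMILES strings) to another (say, text captions) succeeds across benchmark tasks. The sketch below assumes that reading; the exact construction in the paper may differ, and the modality names are illustrative.

```python
def modal_transition_matrix(success_counts):
    # success_counts[src][dst]: how many benchmark tasks succeeded when
    # translating from modality src to modality dst. Row-normalizing turns
    # raw counts into empirical transition probabilities, so the largest
    # entry in a row points to the best-matched target modality.
    matrix = {}
    for src, row in success_counts.items():
        total = sum(row.values())
        matrix[src] = {dst: count / total for dst, count in row.items()}
    return matrix
```

Reading off the per-row maxima is then a direct way to recommend "the most suitable modality" for a given source representation.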

Citations: 0
Learning from models beyond fine-tuning
IF 23.8 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-01-16 | DOI: 10.1038/s42256-024-00961-0
Hongling Zheng, Li Shen, Anke Tang, Yong Luo, Han Hu, Bo Du, Yonggang Wen, Dacheng Tao

Foundation models have demonstrated remarkable performance across various tasks, primarily due to their abilities to comprehend instructions and access extensive, high-quality data. These capabilities showcase the effectiveness of current foundation models and suggest a promising trajectory. Owing to multiple constraints, such as the extreme scarcity or inaccessibility of raw data used to train foundation models and the high cost of training large-scale foundation models from scratch, the use of pre-existing foundation models or application programming interfaces for downstream tasks has become a new research trend, which we call Learn from Model (LFM). LFM involves extracting and leveraging prior knowledge from foundation models through fine-tuning, editing and fusion methods and applying it to downstream tasks. We emphasize that maximizing the use of parametric knowledge in data-scarce scenarios is critical to LFM. Analysing the LFM paradigm can guide the selection of the most appropriate technology in a given scenario to minimize parameter storage and computational costs while improving the performance of foundation models on new tasks. This Review provides a comprehensive overview of current methods based on foundation models from the perspective of LFM.

Citations: 0
A machine learning approach to leveraging electronic health records for enhanced omics analysis
IF 23.8 · CAS Tier 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2025-01-16 · DOI: 10.1038/s42256-024-00974-9
Samson J. Mataraso, Camilo A. Espinosa, David Seong, S. Momsen Reincke, Eloise Berson, Jonathan D. Reiss, Yeasul Kim, Marc Ghanem, Chi-Hung Shu, Tomin James, Yuqi Tan, Sayane Shome, Ina A. Stelzer, Dorien Feyaerts, Ronald J. Wong, Gary M. Shaw, Martin S. Angst, Brice Gaudilliere, David K. Stevenson, Nima Aghaeepour

Omics studies produce a large number of measurements, enabling the development, validation and interpretation of systems-level biological models. Large cohorts are required to power these complex models; yet, the cohort size remains limited due to clinical and budgetary constraints. We introduce clinical and omics multimodal analysis enhanced with transfer learning (COMET), a machine learning framework that incorporates large, observational electronic health record databases and transfer learning to improve the analysis of small datasets from omics studies. By pretraining on electronic health record data and adaptively blending both early and late fusion strategies, COMET overcomes the limitations of existing multimodal machine learning methods. Using two independent datasets, we showed that COMET improved the predictive modelling performance and biological discovery compared with the analysis of omics data with traditional methods. By incorporating electronic health record data into omics analyses, COMET enables more precise patient classifications, beyond the simplistic binary reduction to cases and controls. This framework can be broadly applied to the analysis of multimodal omics studies and reveals more powerful biological insights from limited cohort sizes.
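The "adaptively blending both early and late fusion strategies" idea can be made concrete with a toy sketch. This is an illustrative assumption, not the COMET implementation: the linear scorers, feature vectors and gate `alpha` below are invented stand-ins for models trained on EHR and omics data.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def early_fusion(w_joint, ehr, omics):
    # Early fusion: one model over the concatenated feature vector.
    return dot(w_joint, ehr + omics)

def late_fusion(w_ehr, w_omics, ehr, omics):
    # Late fusion: average the per-modality model outputs.
    return 0.5 * (dot(w_ehr, ehr) + dot(w_omics, omics))

def blended_score(alpha, w_joint, w_ehr, w_omics, ehr, omics):
    # alpha in [0, 1] interpolates between the two fusion strategies;
    # in practice it would be tuned or learned on validation data.
    return alpha * early_fusion(w_joint, ehr, omics) + \
        (1 - alpha) * late_fusion(w_ehr, w_omics, ehr, omics)

ehr, omics = [1.0, 0.0], [0.5]
score = blended_score(0.3, [0.2, 0.1, 0.4], [0.6, 0.0], [0.8], ehr, omics)
```

Setting `alpha` to 0 or 1 recovers pure late or pure early fusion, so a learned gate can fall back to whichever strategy the data support.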

Citations: 0