Pub Date: 2025-08-25 | DOI: 10.1038/s42256-025-01101-y
Melissa S. Cantú, Michael R. King
{"title":"LLMs as all-in-one tools to easily generate publication-ready citation diversity reports","authors":"Melissa S. Cantú, Michael R. King","doi":"10.1038/s42256-025-01101-y","DOIUrl":"10.1038/s42256-025-01101-y","url":null,"abstract":"","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1371-1372"},"PeriodicalIF":23.9,"publicationDate":"2025-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144900551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-08-21 | DOI: 10.1038/s42256-025-01097-5
Chuangyi Han, Senlin Lin, Zhikang Wang, Yan Cui, Qi Zou, Zhiyuan Yuan
Self-supervised learning (SSL) has emerged as a powerful approach for learning meaningful representations from large-scale unlabelled datasets in single-cell genomics. Richter et al. evaluated SSL pretext tasks for modelling single-cell RNA sequencing (scRNA-seq) data, demonstrating the effective use of SSL models. However, the transferability of these pretrained SSL models to the spatial transcriptomics domain remains unexplored. Here we assess the performance of three SSL models (random mask, gene programme mask and Barlow Twins) pretrained on scRNA-seq data with spatial transcriptomics datasets, focusing on cell-type prediction and spatial clustering. Our experiments demonstrate that the SSL model with the random mask strategy exhibits the best overall performance among the evaluated SSL models. Moreover, models trained from scratch on spatial transcriptomics data outperform the fine-tuned SSL models on cell-type prediction, highlighting a domain gap between scRNA-seq and spatial transcriptomics data whose underlying causes remain an open question. Through expanded analyses of multiple imputation methods and data degradation scenarios, we demonstrate that gene imputation degrades SSL model performance on cell-type prediction, an effect that is exacerbated by increasing data sparsity. Finally, integrating zero-shot random mask embeddings into selected spatial clustering methods significantly enhanced their accuracy. Overall, our findings provide valuable insights into the limitations and potential of transferring SSL models to spatial transcriptomics and offer practical guidance for researchers leveraging pretrained models for spatial transcriptomics data analysis. Self-supervised learning models for single-cell RNA sequencing data exhibit poor transferability to spatial transcriptomics for cell-type prediction, although their learned features may enhance spatial analysis.
Reusability report: Exploring the transferability of self-supervised learning models from single-cell to spatial transcriptomics. Nature Machine Intelligence 7(9), 1414–1428.
Pub Date: 2025-08-20 | DOI: 10.1038/s42256-025-01106-7
Recent years have seen a surge in geospatial artificial intelligence models, with promising applications in ecological and environmental monitoring tasks. Further work should also focus on the sustainable development of such models.
{"title":"Towards responsible geospatial foundation models","authors":"","doi":"10.1038/s42256-025-01106-7","DOIUrl":"10.1038/s42256-025-01106-7","url":null,"abstract":"Recent years have seen a surge in geospatial artificial intelligence models, with promising applications in ecological and environmental monitoring tasks. Further work should also focus on the sustainable development of such models.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 8","pages":"1189-1189"},"PeriodicalIF":23.9,"publicationDate":"2025-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.comhttps://www.nature.com/articles/s42256-025-01106-7.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144900431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-08-20 | DOI: 10.1038/s42256-025-01095-7
Mingyu Li, Kun Song, Jixiao He, Mingzhu Zhao, Gengshu You, Jie Zhong, Mengxi Zhao, Arong Li, Yu Chen, Guobin Li, Ying Kong, Jiacheng Wei, Zhaofu Wang, Jiamin Zhou, Hongbing Yang, Shichao Ma, Hailong Zhang, Irakoze Loïca Mélita, Weidong Lin, Yuhang Lu, Zhengtian Yu, Xun Lu, Yujun Zhao, Jian Zhang
Generative drug design opens avenues for discovering novel compounds within the vast chemical space, rather than relying on conventional screening against limited libraries. However, the practical utility of the generated molecules is frequently constrained, as many designs prioritize a narrow range of pharmacological properties and neglect physical reliability, which hinders the success rate of subsequent wet-laboratory evaluations. Here, to address this, we propose ED2Mol, a deep learning-based approach that leverages fundamental electron density information to improve de novo molecular generation and optimization. Extensive evaluations across multiple benchmarks demonstrate that ED2Mol surpasses existing methods in generation success rate and achieves >97% physical reliability. It also facilitates automated hit optimization, which other methods using fragment-based strategies do not fully implement. Furthermore, ED2Mol exhibits generalizability to more challenging, unseen allosteric pocket benchmarks, attaining consistent performance. More importantly, ED2Mol has been applied to various real-world essential targets, successfully identifying wet-laboratory-validated bioactive compounds, ranging from FGFR3 orthosteric inhibitors and CDC42 allosteric inhibitors to GCK and GPRC5A allosteric activators. The directly generated binding modes of these compounds closely match molecular docking predictions and are further validated by an X-ray co-crystal structure. All these results highlight ED2Mol’s potential as a useful tool in drug design with enhanced effectiveness, physical reliability and practical applicability. A deep generative model is developed for de novo molecular design and optimization by leveraging electron density. Wet-laboratory assays validated its reliability to generate diverse bioactive molecules: orthosteric and allosteric, inhibitors and activators.
Electron-density-informed effective and reliable de novo molecular design and optimization with ED2Mol. Nature Machine Intelligence 7(8), 1355–1368.
Pub Date: 2025-08-20 | DOI: 10.1038/s42256-025-01089-5
Eugen Ursu, Aygul Minnegalieva, Puneet Rawat, Maria Chernigovskaya, Robi Tacutu, Geir Kjetil Sandve, Philippe A. Robert, Victor Greiff
Supervised machine learning models depend on training datasets containing positive and negative examples: dataset composition directly impacts model performance and bias. Given the importance of machine learning for immunotherapeutic design, we examined how different negative class definitions affect model generalization and rule discovery for antibody–antigen binding. Using synthetic structure-based binding data, we evaluated models trained with various definitions of negative sets. Our findings reveal that high out-of-distribution performance can be achieved when the negative dataset contains samples more similar to the positive dataset, despite lower in-distribution performance. Furthermore, by leveraging ground-truth information, we show that the binding rules associated with positive data change based on the negative data used. Validation on experimental data supported the simulation-based observations. This work underscores the role of dataset composition in creating robust, generalizable and biology-aware sequence-based ML models. Negative data composition critically shapes machine learning robustness in sequence-based biological tasks. Training data composition and its implications for biological rule discovery are investigated.
{"title":"Training data composition determines machine learning generalization and biological rule discovery","authors":"Eugen Ursu, Aygul Minnegalieva, Puneet Rawat, Maria Chernigovskaya, Robi Tacutu, Geir Kjetil Sandve, Philippe A. Robert, Victor Greiff","doi":"10.1038/s42256-025-01089-5","DOIUrl":"10.1038/s42256-025-01089-5","url":null,"abstract":"Supervised machine learning models depend on training datasets containing positive and negative examples: dataset composition directly impacts model performance and bias. Given the importance of machine learning for immunotherapeutic design, we examined how different negative class definitions affect model generalization and rule discovery for antibody–antigen binding. Using synthetic-structure-based binding data, we evaluated models trained with various definitions of negative sets. Our findings reveal that high out-of-distribution performance can be achieved when the negative dataset contains more similar samples to the positive dataset, despite lower in-distribution performance. Furthermore, by leveraging ground-truth information, we show that binding rules associated with positive data change based on the negative data used. Validation on experimental data supported simulation-based observations. This work underscores the role of dataset composition in creating robust, generalizable and biology-aware sequence-based ML models. Negative data composition critically shapes machine learning robustness in sequence-based biological tasks. Training data composition and its implications are investigated on biological rule discoveries.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 8","pages":"1206-1219"},"PeriodicalIF":23.9,"publicationDate":"2025-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144898527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-08-20 | DOI: 10.1038/s42256-025-01080-0
Wesley Ta, Jonathan M. Stokes
Thoughtfully designed negative training datasets may hold the key to more robust machine learning models. Ursu et al. reveal how negative training data composition shapes antibody prediction models and their generalizability. Sometimes, the best way to get better is to train harder.
{"title":"The importance of negative training data for robust antibody binding prediction","authors":"Wesley Ta, Jonathan M. Stokes","doi":"10.1038/s42256-025-01080-0","DOIUrl":"10.1038/s42256-025-01080-0","url":null,"abstract":"Thoughtfully designed negative training datasets may hold the key to more robust machine learning models. Ursu et al. reveal how negative training data composition shapes antibody prediction models and their generalizability. Sometimes, the best way to get better is to train harder.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 8","pages":"1192-1194"},"PeriodicalIF":23.9,"publicationDate":"2025-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144898522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-08-19 | DOI: 10.1038/s42256-025-01098-4
Li-Cheng Xu, Miao-Jiong Tang, Junyi An, Fenglei Cao, Yuan Qi
Artificial intelligence has transformed the field of precise organic synthesis. Data-driven methods, including machine learning and deep learning, have shown great promise in predicting reaction performance and synthesis planning. However, the inherent methodological divergence between numerical regression-driven reaction performance prediction and sequence generation-based synthesis planning creates formidable challenges in constructing a unified deep learning architecture. Here we present RXNGraphormer, a framework to jointly address these tasks through a unified pre-training approach. By synergizing graph neural networks for intramolecular pattern recognition with Transformer-based models for intermolecular interaction modelling, and training on 13 million reactions via a carefully designed strategy, RXNGraphormer achieves state-of-the-art performance across eight benchmark datasets for reactivity or selectivity prediction and forward-synthesis or retrosynthesis planning, as well as three external realistic datasets for reactivity and selectivity prediction. Notably, the model generates chemically meaningful embeddings that spontaneously cluster reactions by type without explicit supervision. This work bridges the critical gap between performance prediction and synthesis planning tasks in chemical AI, offering a versatile tool for accurate reaction prediction and synthesis design. Xu et al. present RXNGraphormer, a pre-trained model that learns bond transformation patterns from over 13 million reactions, achieving state-of-the-art accuracy in reaction performance prediction and synthesis planning.
{"title":"A unified pre-trained deep learning framework for cross-task reaction performance prediction and synthesis planning","authors":"Li-Cheng Xu, Miao-Jiong Tang, Junyi An, Fenglei Cao, Yuan Qi","doi":"10.1038/s42256-025-01098-4","DOIUrl":"10.1038/s42256-025-01098-4","url":null,"abstract":"Artificial intelligence has transformed the field of precise organic synthesis. Data-driven methods, including machine learning and deep learning, have shown great promise in predicting reaction performance and synthesis planning. However, the inherent methodological divergence between numerical regression-driven reaction performance prediction and sequence generation-based synthesis planning creates formidable challenges in constructing a unified deep learning architecture. Here we present RXNGraphormer, a framework to jointly address these tasks through a unified pre-training approach. By synergizing graph neural networks for intramolecular pattern recognition with Transformer-based models for intermolecular interaction modelling, and training on 13 million reactions via a carefully designed strategy, RXNGraphormer achieves state-of-the-art performance across eight benchmark datasets for reactivity or selectivity prediction and forward-synthesis or retrosynthesis planning, as well as three external realistic datasets for reactivity and selectivity prediction. Notably, the model generates chemically meaningful embeddings that spontaneously cluster reactions by type without explicit supervision. This work bridges the critical gap between performance prediction and synthesis planning tasks in chemical AI, offering a versatile tool for accurate reaction prediction and synthesis design. Xu et al. present RXNGraphormer, a pre-trained model that learns bond transformation patterns from over 13 million reactions, achieving state-of-the-art accuracy in reaction performance prediction and synthesis planning.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1561-1571"},"PeriodicalIF":23.9,"publicationDate":"2025-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144898568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-08-18 | DOI: 10.1038/s42256-025-01088-6
Haonan Duan, Marta Skreta, Leonardo Cotta, Ella Miray Rajaonson, Nikita Dhawan, Alán Aspuru-Guzik, Chris J. Maddison
Protein language models are trained to predict amino acid sequences from vast protein databases and learn to represent proteins as feature vectors. These vector representations have enabled impressive applications, from predicting mutation effects to protein folding. One of the reasons offered for the success of these models is that conserved sequence motifs tend to be important for protein fitness. Yet, the relationship between sequence conservation and fitness can be confounded by the evolutionary and environmental context. Should we, therefore, look to other data sources that may contain more direct functional information? In this work, we conduct a comprehensive study examining the effects of training protein models to predict 19 types of text annotation from UniProt. Our results show that fine-tuning protein models on a subset of these annotations enhances the models’ predictive capabilities on a variety of function prediction tasks. In particular, when evaluated on our tasks, our model outperforms the basic local alignment search tool, which none of the pretrained protein models accomplished. Our results suggest that a much wider array of data modalities, such as text annotations, may be tapped to improve protein language models. Although protein language models have enabled major advances, they often rely on indirect signals that may not fully capture functional relevance. Fine-tuning these models on textual annotations is shown to improve their performance on function prediction tasks.
{"title":"Boosting the predictive power of protein representations with a corpus of text annotations","authors":"Haonan Duan, Marta Skreta, Leonardo Cotta, Ella Miray Rajaonson, Nikita Dhawan, Alán Aspuru-Guzik, Chris J. Maddison","doi":"10.1038/s42256-025-01088-6","DOIUrl":"10.1038/s42256-025-01088-6","url":null,"abstract":"Protein language models are trained to predict amino acid sequences from vast protein databases and learn to represent proteins as feature vectors. These vector representations have enabled impressive applications, from predicting mutation effects to protein folding. One of the reasons offered for the success of these models is that conserved sequence motifs tend to be important for protein fitness. Yet, the relationship between sequence conservation and fitness can be confounded by the evolutionary and environmental context. Should we, therefore, look to other data sources that may contain more direct functional information? In this work, we conduct a comprehensive study examining the effects of training protein models to predict 19 types of text annotation from UniProt. Our results show that fine-tuning protein models on a subset of these annotations enhances the models’ predictive capabilities on a variety of function prediction tasks. In particular, when evaluated on our tasks, our model outperforms the basic local alignment search tool, which none of the pretrained protein models accomplished. Our results suggest that a much wider array of data modalities, such as text annotations, may be tapped to improve protein language models. Although protein language models have enabled major advances, they often rely on indirect signals that may not fully capture functional relevance. Fine-tuning these models on textual annotations is shown to improve their performance on function prediction tasks.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1403-1413"},"PeriodicalIF":23.9,"publicationDate":"2025-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145129515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-08-18 | DOI: 10.1038/s42256-025-01092-w
Takuya Ito, Murray Campbell, Lior Horesh, Tim Klinger, Parikshit Ram
The rapid development of artificial intelligence (AI) systems has created an urgent need for their scientific quantification. While their fluency across a variety of domains is impressive, AI systems fall short on tests requiring algorithmic reasoning—a glaring limitation, given the necessity for interpretable and reliable technology. Despite a surge in reasoning benchmarks emerging from the academic community, no theoretical framework exists to quantify algorithmic reasoning in AI systems. Here we adopt a framework from computational complexity theory to quantify algorithmic generalization using algebraic expressions: algebraic circuit complexity. Algebraic circuit complexity theory—the study of algebraic expressions as circuit models—is a natural framework for studying the complexity of algorithmic computation. Algebraic circuit complexity enables the study of generalization by defining benchmarks in terms of the computational requirements for solving a problem. Moreover, algebraic circuits are generic mathematical objects; an arbitrarily large number of samples can be generated for a specified circuit, making it an ideal experimental sandbox for the data-hungry models that are used today. In this Perspective, we adopt tools from algebraic circuit complexity, apply them to formalize a science of algorithmic generalization, and address key challenges for its successful application to AI science. Despite impressive performances of current large AI models, symbolic and abstract reasoning tasks often elicit failure modes in these systems. In this Perspective, Ito et al. propose to make use of computational complexity theory, formulating algebraic problems as computable circuits to address the challenge of mathematical and symbolic reasoning in AI systems.
{"title":"Quantifying artificial intelligence through algorithmic generalization","authors":"Takuya Ito, Murray Campbell, Lior Horesh, Tim Klinger, Parikshit Ram","doi":"10.1038/s42256-025-01092-w","DOIUrl":"10.1038/s42256-025-01092-w","url":null,"abstract":"The rapid development of artificial intelligence (AI) systems has created an urgent need for their scientific quantification. While their fluency across a variety of domains is impressive, AI systems fall short on tests requiring algorithmic reasoning—a glaring limitation, given the necessity for interpretable and reliable technology. Despite a surge in reasoning benchmarks emerging from the academic community, no theoretical framework exists to quantify algorithmic reasoning in AI systems. Here we adopt a framework from computational complexity theory to quantify algorithmic generalization using algebraic expressions: algebraic circuit complexity. Algebraic circuit complexity theory—the study of algebraic expressions as circuit models—is a natural framework for studying the complexity of algorithmic computation. Algebraic circuit complexity enables the study of generalization by defining benchmarks in terms of the computational requirements for solving a problem. Moreover, algebraic circuits are generic mathematical objects; an arbitrarily large number of samples can be generated for a specified circuit, making it an ideal experimental sandbox for the data-hungry models that are used today. In this Perspective, we adopt tools from algebraic circuit complexity, apply them to formalize a science of algorithmic generalization, and address key challenges for its successful application to AI science. Despite impressive performances of current large AI models, symbolic and abstract reasoning tasks often elicit failure modes in these systems. In this Perspective, Ito et al. propose to make use of computational complexity theory, formulating algebraic problems as computable circuits to address the challenge of mathematical and symbolic reasoning in AI systems.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 8","pages":"1195-1205"},"PeriodicalIF":23.9,"publicationDate":"2025-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145123738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-08-15 | DOI: 10.1038/s42256-025-01091-x
Raúl Miñán, Javier Gallardo, Álvaro Ciudad, Alexis Molina
Molecular docking plays a crucial role in structure-based drug discovery, enabling the prediction of how small molecules interact with protein targets. Traditional docking methods rely on scoring functions and search heuristics, whereas recent generative approaches, such as DiffDock, leverage deep learning for pose prediction. However, blind-diffusion-based docking often struggles with binding site localization and pose accuracy, particularly in complex protein–ligand systems. This work introduces GeoDirDock (GDD), a guided diffusion approach to molecular docking that enhances the accuracy and physical plausibility of ligand docking predictions. GDD guides the denoising process of a diffusion model along geodesic paths within multiple spaces representing translational, rotational and torsional degrees of freedom. Our method leverages expert knowledge to direct the generative modelling process, specifically targeting desired protein–ligand interaction regions. We demonstrate that GDD outperforms existing blind docking methods in terms of root mean squared distance accuracy and physicochemical pose realism. Our results indicate that incorporating domain expertise into the diffusion process leads to more biologically relevant docking predictions. Additionally, we explore the potential of GDD as a template-based modelling tool for lead optimization in drug discovery through angle transfer in maximum common substructure docking, showcasing its capability to accurately predict ligand orientations for chemically similar compounds. Future applications in real-world drug discovery campaigns will naturally continue to refine and extend the utility of prior-informed diffusion docking methods. GeoDirDock is a framework that guides the denoising process of a generative diffusion docking model along geodesic paths within multiple spaces representing translational, rotational and torsional degrees of freedom. This approach enhances the accuracy and physical plausibility of ligand docking predictions.
{"title":"Informed protein–ligand docking via geodesic guidance in translational, rotational and torsional spaces","authors":"Raúl Miñán, Javier Gallardo, Álvaro Ciudad, Alexis Molina","doi":"10.1038/s42256-025-01091-x","DOIUrl":"10.1038/s42256-025-01091-x","url":null,"abstract":"Molecular docking plays a crucial role in structure-based drug discovery, enabling the prediction of how small molecules interact with protein targets. Traditional docking methods rely on scoring functions and search heuristics, whereas recent generative approaches, such as DiffDock, leverage deep learning for pose prediction. However, blind-diffusion-based docking often struggles with binding site localization and pose accuracy, particularly in complex protein–ligand systems. This work introduces GeoDirDock (GDD), a guided diffusion approach to molecular docking that enhances the accuracy and physical plausibility of ligand docking predictions. GDD guides the denoising process of a diffusion model along geodesic paths within multiple spaces representing translational, rotational and torsional degrees of freedom. Our method leverages expert knowledge to direct the generative modelling process, specifically targeting desired protein–ligand interaction regions. We demonstrate that GDD outperforms existing blind docking methods in terms of root mean squared distance accuracy and physicochemical pose realism. Our results indicate that incorporating domain expertise into the diffusion process leads to more biologically relevant docking predictions. Additionally, we explore the potential of GDD as a template-based modelling tool for lead optimization in drug discovery through angle transfer in maximum common substructure docking, showcasing its capability to accurately predict ligand orientations for chemically similar compounds. Future applications in real-world drug discovery campaigns will naturally continue to refine and extend the utility of prior-informed diffusion docking methods. GeoDirDock is a framework that guides the denoising process of a generative diffusion docking model along geodesic paths within multiple spaces representing translational, rotational and torsional degrees of freedom. This approach enhances the accuracy and physical plausibility of ligand docking predictions.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1555-1560"},"PeriodicalIF":23.9,"publicationDate":"2025-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144851536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}