Pub Date : 2024-11-19DOI: 10.1038/s42256-024-00904-9
Youcan Yan, Ahmed Zermane, Jia Pan, Abderrahmane Kheddar
Electronic skins integrating both normal and shear force per taxel have a wide range of applications across diverse fields, including robotics, haptics and health monitoring. Current multi-axis tactile sensors often present complexities in structure and fabrication or require an extensive calibration process, limiting their widespread applications. Here we report an electronic soft magnetic skin capable of self-decoupling three-axis forces at each taxel. We use a simple sensor structure with customizable sensitivity and measurement range, reducing the calibration complexity from known quadratic (N2) or cubic (N3) scales down to a linear (3N) scale. The three-axis self-decoupling property of the sensor is achieved by overlaying two sinusoidally magnetized flexible magnetic films with orthogonal magnetization patterns. Leveraging the self-decoupling feature and its simple structure, we demonstrate that our sensor can facilitate a diverse range of applications, such as measuring the three-dimensional force distribution in artificial knee joints, teaching robots by touch demonstration and monitoring the interaction forces between knee braces and human skin during various activities.
{"title":"A soft skin with self-decoupled three-axis force-sensing taxels","authors":"Youcan Yan, Ahmed Zermane, Jia Pan, Abderrahmane Kheddar","doi":"10.1038/s42256-024-00904-9","DOIUrl":"https://doi.org/10.1038/s42256-024-00904-9","url":null,"abstract":"<p>Electronic skins integrating both normal and shear force per taxel have a wide range of applications across diverse fields, including robotics, haptics and health monitoring. Current multi-axis tactile sensors often present complexities in structure and fabrication or require an extensive calibration process, limiting their widespread applications. Here we report an electronic soft magnetic skin capable of self-decoupling three-axis forces at each taxel. We use a simple sensor structure with customizable sensitivity and measurement range, reducing the calibration complexity from known quadratic (<i>N</i><sup>2</sup>) or cubic (<i>N</i><sup>3</sup>) scales down to a linear (3<i>N</i>) scale. The three-axis self-decoupling property of the sensor is achieved by overlaying two sinusoidally magnetized flexible magnetic films with orthogonal magnetization patterns. Leveraging the self-decoupling feature and its simple structure, we demonstrate that our sensor can facilitate a diverse range of applications, such as measuring the three-dimensional force distribution in artificial knee joints, teaching robots by touch demonstration and monitoring the interaction forces between knee braces and human skin during various activities.</p>","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"13 1","pages":""},"PeriodicalIF":23.8,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142670830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-11-19DOI: 10.1038/s42256-024-00928-1
Marko Njirjak, Lucija Žužić, Marko Babić, Patrizia Janković, Erik Otović, Daniela Kalafatovic, Goran Mauša
Supramolecular peptide-based materials have great potential for revolutionizing fields like nanotechnology and medicine. However, deciphering the intricate sequence-to-assembly pathway, essential for their real-life applications, remains a challenging endeavour. Their discovery relies primarily on empirical approaches that require substantial financial resources, impeding their disruptive potential. Consequently, despite the multitude of characterized self-assembling peptides and their demonstrated advantages, only a few peptide materials have found their way to the market. Machine learning trained on experimentally verified data presents a promising tool for quickly identifying sequences with a high propensity to self-assemble, thereby focusing resource expenditures on the most promising candidates. Here we introduce a framework that implements an accurate classifier in a metaheuristic-based generative model to navigate the search through the peptide sequence space of challenging size. For this purpose, we trained five recurrent neural networks among which the hybrid model that uses sequential information on aggregation propensity and specific physicochemical properties achieved a superior performance with 81.9% accuracy and 0.865 F1 score. Molecular dynamics simulations and experimental validation have confirmed the generative model to be 80–95% accurate in the discovery of self-assembling peptides, outperforming the current state-of-the-art models. The proposed modular framework efficiently complements human intuition in the exploration of self-assembling peptides and presents an important step in the development of intelligent laboratories for accelerated material discovery.
{"title":"Reshaping the discovery of self-assembling peptides with generative AI guided by hybrid deep learning","authors":"Marko Njirjak, Lucija Žužić, Marko Babić, Patrizia Janković, Erik Otović, Daniela Kalafatovic, Goran Mauša","doi":"10.1038/s42256-024-00928-1","DOIUrl":"https://doi.org/10.1038/s42256-024-00928-1","url":null,"abstract":"<p>Supramolecular peptide-based materials have great potential for revolutionizing fields like nanotechnology and medicine. However, deciphering the intricate sequence-to-assembly pathway, essential for their real-life applications, remains a challenging endeavour. Their discovery relies primarily on empirical approaches that require substantial financial resources, impeding their disruptive potential. Consequently, despite the multitude of characterized self-assembling peptides and their demonstrated advantages, only a few peptide materials have found their way to the market. Machine learning trained on experimentally verified data presents a promising tool for quickly identifying sequences with a high propensity to self-assemble, thereby focusing resource expenditures on the most promising candidates. Here we introduce a framework that implements an accurate classifier in a metaheuristic-based generative model to navigate the search through the peptide sequence space of challenging size. For this purpose, we trained five recurrent neural networks among which the hybrid model that uses sequential information on aggregation propensity and specific physicochemical properties achieved a superior performance with 81.9% accuracy and 0.865 F1 score. Molecular dynamics simulations and experimental validation have confirmed the generative model to be 80–95% accurate in the discovery of self-assembling peptides, outperforming the current state-of-the-art models. The proposed modular framework efficiently complements human intuition in the exploration of self-assembling peptides and presents an important step in the development of intelligent laboratories for accelerated material discovery.</p>","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"18 1","pages":""},"PeriodicalIF":23.8,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142670829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-11-19DOI: 10.1038/s42256-024-00918-3
Solomon Asghar, Qing-Xiang Pei, Giorgio Volpe, Ran Ni
From physics and biology to seismology and economics, the behaviour of countless systems is determined by impactful yet unlikely transitions between metastable states known as rare events, the study of which is essential for understanding and controlling the properties of these systems. Classical computational methods to sample rare events remain prohibitively inefficient and are bottlenecks for enhanced samplers that require prior data. Here we introduce a physics-informed machine learning framework, normalizing Flow enhanced Rare Event Sampler (FlowRES), which uses unsupervised normalizing flow neural networks to enhance Monte Carlo sampling of rare events by generating high-quality non-local Monte Carlo proposals. We validated FlowRES by sampling the transition path ensembles of equilibrium and non-equilibrium systems of Brownian particles, exploring increasingly complex potentials. Beyond eliminating the requirements for prior data, FlowRES features key advantages over established samplers: no collective variables need to be defined, efficiency remains constant even as events become increasingly rare and systems with multiple routes between states can be straightforwardly simulated.
{"title":"Efficient rare event sampling with unsupervised normalizing flows","authors":"Solomon Asghar, Qing-Xiang Pei, Giorgio Volpe, Ran Ni","doi":"10.1038/s42256-024-00918-3","DOIUrl":"https://doi.org/10.1038/s42256-024-00918-3","url":null,"abstract":"<p>From physics and biology to seismology and economics, the behaviour of countless systems is determined by impactful yet unlikely transitions between metastable states known as rare events, the study of which is essential for understanding and controlling the properties of these systems. Classical computational methods to sample rare events remain prohibitively inefficient and are bottlenecks for enhanced samplers that require prior data. Here we introduce a physics-informed machine learning framework, normalizing Flow enhanced Rare Event Sampler (FlowRES), which uses unsupervised normalizing flow neural networks to enhance Monte Carlo sampling of rare events by generating high-quality non-local Monte Carlo proposals. We validated FlowRES by sampling the transition path ensembles of equilibrium and non-equilibrium systems of Brownian particles, exploring increasingly complex potentials. Beyond eliminating the requirements for prior data, FlowRES features key advantages over established samplers: no collective variables need to be defined, efficiency remains constant even as events become increasingly rare and systems with multiple routes between states can be straightforwardly simulated.</p>","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"251 1","pages":""},"PeriodicalIF":23.8,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142673948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On 12 September 2024, OpenAI released two new large language models (LLMs) — o1-preview and o1-mini — marking an important shift in the competitive landscape of commercial LLMs, particularly concerning their reasoning capabilities. Since the introduction of GPT-3.5, OpenAI has launched 31 LLMs in two years. Researchers are rapidly applying these evolving commercial models in clinical medicine, achieving results that sometimes exceed human performance in specific tasks. Although such success is encouraging, the development of the models used for these tasks may not align with the characteristics and needs of clinical practice.
LLMs can be categorized as either open-source or closed-source. Open-source models, such as Meta’s Llama, allow developers to access source code, training data and documentation freely. By contrast, closed-source models are accessed only through official channels or application programming interfaces (APIs). Initially, open-source models dominated the LLM landscape, until the release of OpenAI’s GPT-3 in 20201, which attracted considerable commercial interest and shifted focus towards closed-source approaches2.
{"title":"Clinical large language models with misplaced focus","authors":"Zining Luo, Haowei Ma, Zhiwu Li, Yuquan Chen, Yixin Sun, Aimin Hu, Jiang Yu, Yang Qiao, Junxian Gu, Hongying Li, Xuxi Peng, Dunrui Wang, Ying Liu, Zhenglong Liu, Jiebin Xie, Zhen Jiang, Gang Tian","doi":"10.1038/s42256-024-00929-0","DOIUrl":"https://doi.org/10.1038/s42256-024-00929-0","url":null,"abstract":"<p>On 12 September 2024, OpenAI released two new large language models (LLMs) — o1-preview and o1-mini — marking an important shift in the competitive landscape of commercial LLMs, particularly concerning their reasoning capabilities. Since the introduction of GPT-3.5, OpenAI has launched 31 LLMs in two years. Researchers are rapidly applying these evolving commercial models in clinical medicine, achieving results that sometimes exceed human performance in specific tasks. Although such success is encouraging, the development of the models used for these tasks may not align with the characteristics and needs of clinical practice.</p><p>LLMs can be categorized as either open-source or closed-source. Open-source models, such as Meta’s Llama, allow developers to access source code, training data and documentation freely. By contrast, closed-source models are accessed only through official channels or application programming interfaces (APIs). Initially, open-source models dominated the LLM landscape, until the release of OpenAI’s GPT-3 in 2020<sup>1</sup>, which attracted considerable commercial interest and shifted focus towards closed-source approaches<sup>2</sup>.</p>","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"18 1","pages":""},"PeriodicalIF":23.8,"publicationDate":"2024-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142670260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-11-15DOI: 10.1038/s42256-024-00920-9
Zaixi Zhang, Wan Xiang Shen, Qi Liu, Marinka Zitnik
Designing protein-binding proteins is critical for drug discovery. However, artificial-intelligence-based design of such proteins is challenging due to the complexity of protein–ligand interactions, the flexibility of ligand molecules and amino acid side chains, and sequence–structure dependencies. We introduce PocketGen, a deep generative model that produces residue sequence and atomic structure of the protein regions in which ligand interactions occur. PocketGen promotes consistency between protein sequence and structure by using a graph transformer for structural encoding and a sequence refinement module based on a protein language model. The graph transformer captures interactions at multiple scales, including atom, residue and ligand levels. For sequence refinement, PocketGen integrates a structural adapter into the protein language model, ensuring that structure-based predictions align with sequence-based predictions. PocketGen can generate high-fidelity protein pockets with enhanced binding affinity and structural validity. It operates ten times faster than physics-based methods and achieves a 97% success rate, defined as the percentage of generated pockets with higher binding affinity than reference pockets. Additionally, it attains an amino acid recovery rate exceeding 63%.
{"title":"Efficient generation of protein pockets with PocketGen","authors":"Zaixi Zhang, Wan Xiang Shen, Qi Liu, Marinka Zitnik","doi":"10.1038/s42256-024-00920-9","DOIUrl":"https://doi.org/10.1038/s42256-024-00920-9","url":null,"abstract":"<p>Designing protein-binding proteins is critical for drug discovery. However, artificial-intelligence-based design of such proteins is challenging due to the complexity of protein–ligand interactions, the flexibility of ligand molecules and amino acid side chains, and sequence–structure dependencies. We introduce PocketGen, a deep generative model that produces residue sequence and atomic structure of the protein regions in which ligand interactions occur. PocketGen promotes consistency between protein sequence and structure by using a graph transformer for structural encoding and a sequence refinement module based on a protein language model. The graph transformer captures interactions at multiple scales, including atom, residue and ligand levels. For sequence refinement, PocketGen integrates a structural adapter into the protein language model, ensuring that structure-based predictions align with sequence-based predictions. PocketGen can generate high-fidelity protein pockets with enhanced binding affinity and structural validity. It operates ten times faster than physics-based methods and achieves a 97% success rate, defined as the percentage of generated pockets with higher binding affinity than reference pockets. Additionally, it attains an amino acid recovery rate exceeding 63%.</p>","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"21 1","pages":""},"PeriodicalIF":23.8,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142637798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-10-21DOI: 10.1038/s42256-024-00921-8
Distinguishing between real and fabricated facts has long been a societal challenge. As the Internet becomes increasingly littered with AI-generated content, the need for curation and safeguarding of high-quality data and information is more crucial than ever.
{"title":"Pick your AI poison","authors":"","doi":"10.1038/s42256-024-00921-8","DOIUrl":"10.1038/s42256-024-00921-8","url":null,"abstract":"Distinguishing between real and fabricated facts has long been a societal challenge. As the Internet becomes increasingly littered with AI-generated content, the need for curation and safeguarding of high-quality data and information is more crucial than ever.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"6 10","pages":"1119-1119"},"PeriodicalIF":18.8,"publicationDate":"2024-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.com/articles/s42256-024-00921-8.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142486863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Deciphering the relationships between genes and complex traits can enhance our understanding of phenotypic variations and disease mechanisms. However, determining the specific roles of individual genes and quantifying their direct and indirect causal effects on complex traits remains a significant challenge. Here we present a framework (called Bayesian network genome-wide association studies (BN-GWAS)) to decipher the total and direct causal effects of individual genes. BN-GWAS leverages imputed expression profiles from GWAS and raw expression data from a reference dataset to construct a directed gene–gene–phenotype causal network. It allows gene expression and disease traits to be evaluated in different samples, significantly improving the flexibility and applicability of the approach. It can be extended to decipher the joint causal network of two or more traits, and exhibits high specificity and precision (positive predictive value), making it particularly useful for selecting genes for follow-up studies. We verified the feasibility and validity of BN-GWAS by extensive simulations and applications to 52 traits across 14 tissues in the UK Biobank, revealing insights into their genetic architectures, including the relative contributions of direct, indirect and mediating causal genes. The identified (direct) causal genes were significantly enriched for genes highlighted in the Open Targets database. Overall, BN-GWAS provides a flexible and powerful framework for elucidating the genetic basis of complex traits through a systems-level, causal inference approach. Genome-wide association studies generate extensive data, but interpreting these data remains challenging. A Bayesian-network-based method is presented that uses imputed and raw gene expression data to decipher the causal effects of individual genes.
{"title":"Estimation of causal effects of genes on complex traits using a Bayesian-network-based framework applied to GWAS data","authors":"Liangying Yin, Yaning Feng, Yujia Shi, Alexandria Lau, Jinghong Qiu, Pak-Chung Sham, Hon-Cheong So","doi":"10.1038/s42256-024-00906-7","DOIUrl":"10.1038/s42256-024-00906-7","url":null,"abstract":"Deciphering the relationships between genes and complex traits can enhance our understanding of phenotypic variations and disease mechanisms. However, determining the specific roles of individual genes and quantifying their direct and indirect causal effects on complex traits remains a significant challenge. Here we present a framework (called Bayesian network genome-wide association studies (BN-GWAS)) to decipher the total and direct causal effects of individual genes. BN-GWAS leverages imputed expression profiles from GWAS and raw expression data from a reference dataset to construct a directed gene–gene–phenotype causal network. It allows gene expression and disease traits to be evaluated in different samples, significantly improving the flexibility and applicability of the approach. It can be extended to decipher the joint causal network of two or more traits, and exhibits high specificity and precision (positive predictive value), making it particularly useful for selecting genes for follow-up studies. We verified the feasibility and validity of BN-GWAS by extensive simulations and applications to 52 traits across 14 tissues in the UK Biobank, revealing insights into their genetic architectures, including the relative contributions of direct, indirect and mediating causal genes. The identified (direct) causal genes were significantly enriched for genes highlighted in the Open Targets database. Overall, BN-GWAS provides a flexible and powerful framework for elucidating the genetic basis of complex traits through a systems-level, causal inference approach. Genome-wide association studies generate extensive data, but interpreting these data remains challenging. A Bayesian-network-based method is presented that uses imputed and raw gene expression data to decipher the causal effects of individual genes.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"6 10","pages":"1231-1244"},"PeriodicalIF":18.8,"publicationDate":"2024-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142443839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-10-07DOI: 10.1038/s42256-024-00908-5
Bohao Zou, Jingjing Wang, Yi Ding, Zhenmiao Zhang, Yufen Huang, Xiaodong Fang, Ka Chun Cheung, Simon See, Lu Zhang
Metagenome-assembled genomes (MAGs) offer valuable insights into the exploration of microbial dark matter using metagenomic sequencing data. However, there is growing concern that contamination in MAGs may substantially affect the results of downstream analysis. Current MAG decontamination tools primarily rely on marker genes and do not fully use the contextual information of genomic sequences. To overcome this limitation, we introduce Deepurify for MAG decontamination. Deepurify uses a multi-modal deep language model with contrastive learning to match microbial genomic sequences with their taxonomic lineages. It allocates contigs within a MAG to a MAG-separated tree and applies a tree traversal algorithm to partition MAGs into sub-MAGs, with the goal of maximizing the number of high- and medium-quality sub-MAGs. Here we show that Deepurify outperformed MDMclearer and MAGpurify on simulated data, CAMI datasets and real-world datasets with varying complexities. Deepurify increased the number of high-quality MAGs by 20.0% in soil, 45.1% in ocean, 45.5% in plants, 33.8% in freshwater and 28.5% in human faecal metagenomic sequencing datasets. Metagenome-assembled genomes (MAGs) provide insights into microbial dark matter, but contamination remains a concern for downstream analysis. Zou et al. develop a multi-modal deep language model that leverages microbial sequences to remove ‘unexpected’ contigs from MAGs. This approach is compatible with any contig binning tools and increases the number of high-quality bins.
{"title":"A multi-modal deep language model for contaminant removal from metagenome-assembled genomes","authors":"Bohao Zou, Jingjing Wang, Yi Ding, Zhenmiao Zhang, Yufen Huang, Xiaodong Fang, Ka Chun Cheung, Simon See, Lu Zhang","doi":"10.1038/s42256-024-00908-5","DOIUrl":"10.1038/s42256-024-00908-5","url":null,"abstract":"Metagenome-assembled genomes (MAGs) offer valuable insights into the exploration of microbial dark matter using metagenomic sequencing data. However, there is growing concern that contamination in MAGs may substantially affect the results of downstream analysis. Current MAG decontamination tools primarily rely on marker genes and do not fully use the contextual information of genomic sequences. To overcome this limitation, we introduce Deepurify for MAG decontamination. Deepurify uses a multi-modal deep language model with contrastive learning to match microbial genomic sequences with their taxonomic lineages. It allocates contigs within a MAG to a MAG-separated tree and applies a tree traversal algorithm to partition MAGs into sub-MAGs, with the goal of maximizing the number of high- and medium-quality sub-MAGs. Here we show that Deepurify outperformed MDMclearer and MAGpurify on simulated data, CAMI datasets and real-world datasets with varying complexities. Deepurify increased the number of high-quality MAGs by 20.0% in soil, 45.1% in ocean, 45.5% in plants, 33.8% in freshwater and 28.5% in human faecal metagenomic sequencing datasets. Metagenome-assembled genomes (MAGs) provide insights into microbial dark matter, but contamination remains a concern for downstream analysis. Zou et al. develop a multi-modal deep language model that leverages microbial sequences to remove ‘unexpected’ contigs from MAGs. This approach is compatible with any contig binning tools and increases the number of high-quality bins.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"6 10","pages":"1245-1255"},"PeriodicalIF":18.8,"publicationDate":"2024-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142383814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-10-04DOI: 10.1038/s42256-024-00911-w
Cas Wognum, Jeremy R. Ash, Matteo Aldeghi, Raquel Rodríguez-Pérez, Cheng Fang, Alan C. Cheng, Daniel J. Price, Djork-Arné Clevert, Ola Engkvist, W. Patrick Walters
{"title":"A call for an industry-led initiative to critically assess machine learning for real-world drug discovery","authors":"Cas Wognum, Jeremy R. Ash, Matteo Aldeghi, Raquel Rodríguez-Pérez, Cheng Fang, Alan C. Cheng, Daniel J. Price, Djork-Arné Clevert, Ola Engkvist, W. Patrick Walters","doi":"10.1038/s42256-024-00911-w","DOIUrl":"10.1038/s42256-024-00911-w","url":null,"abstract":"","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"6 10","pages":"1120-1121"},"PeriodicalIF":18.8,"publicationDate":"2024-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142374186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-10-03DOI: 10.1038/s42256-024-00902-x
Guruprasad Raghavan, Bahey Tharwat, Surya Narayanan Hari, Dhruvil Satani, Rex Liu, Matt Thomson
Contemporary machine learning algorithms train artificial neural networks by setting network weights to a single optimized configuration through gradient descent on task-specific training data. The resulting networks can achieve human-level performance on natural language processing, image analysis and agent-based tasks, but lack the flexibility and robustness characteristic of human intelligence. Here we introduce a differential geometry framework—functionally invariant paths—that provides flexible and continuous adaptation of trained neural networks so that secondary tasks can be achieved beyond the main machine learning goal, including increased network sparsification and adversarial robustness. We formulate the weight space of a neural network as a curved Riemannian manifold equipped with a metric tensor whose spectrum defines low-rank subspaces in weight space that accommodate network adaptation without loss of prior knowledge. We formalize adaptation as movement along a geodesic path in weight space while searching for networks that accommodate secondary objectives. With modest computational resources, the functionally invariant path algorithm achieves performance comparable with or exceeding state-of-the-art methods including low-rank adaptation on continual learning, sparsification and adversarial robustness tasks for large language models (bidirectional encoder representations from transformers), vision transformers (ViT and DeIT) and convolutional neural networks. Machine learning often includes secondary objectives, such as sparsity or robustness. To reach these objectives efficiently, the training of a neural network has been interpreted as the exploration of functionally invariant paths in the parameter space.
{"title":"Engineering flexible machine learning systems by traversing functionally invariant paths","authors":"Guruprasad Raghavan, Bahey Tharwat, Surya Narayanan Hari, Dhruvil Satani, Rex Liu, Matt Thomson","doi":"10.1038/s42256-024-00902-x","DOIUrl":"10.1038/s42256-024-00902-x","url":null,"abstract":"Contemporary machine learning algorithms train artificial neural networks by setting network weights to a single optimized configuration through gradient descent on task-specific training data. The resulting networks can achieve human-level performance on natural language processing, image analysis and agent-based tasks, but lack the flexibility and robustness characteristic of human intelligence. Here we introduce a differential geometry framework—functionally invariant paths—that provides flexible and continuous adaptation of trained neural networks so that secondary tasks can be achieved beyond the main machine learning goal, including increased network sparsification and adversarial robustness. We formulate the weight space of a neural network as a curved Riemannian manifold equipped with a metric tensor whose spectrum defines low-rank subspaces in weight space that accommodate network adaptation without loss of prior knowledge. We formalize adaptation as movement along a geodesic path in weight space while searching for networks that accommodate secondary objectives. With modest computational resources, the functionally invariant path algorithm achieves performance comparable with or exceeding state-of-the-art methods including low-rank adaptation on continual learning, sparsification and adversarial robustness tasks for large language models (bidirectional encoder representations from transformers), vision transformers (ViT and DeIT) and convolutional neural networks. Machine learning often includes secondary objectives, such as sparsity or robustness. To reach these objectives efficiently, the training of a neural network has been interpreted as the exploration of functionally invariant paths in the parameter space.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"6 10","pages":"1179-1196"},"PeriodicalIF":18.8,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.com/articles/s42256-024-00902-x.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142369346","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}