Nature Machine Intelligence最新文献

英文中文

Data-driven federated learning in drug discovery with knowledge distillation

IF 23.8 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Nature Machine Intelligence

Pub Date : 2025-03-05 DOI: 10.1038/s42256-025-00991-2

Thierry Hanser, Ernst Ahlberg, Alexander Amberg, Lennart T. Anger, Chris Barber, Richard J. Brennan, Alessandro Brigo, Annie Delaunois, Susanne Glowienke, Nigel Greene, Laura Johnston, Daniel Kuhn, Lara Kuhnke, Jean-François Marchaland, Wolfgang Muster, Jeffrey Plante, Friedrich Rippmann, Yogesh Sabnis, Friedemann Schmidt, Ruud van Deursen, Stéphane Werner, Angela White, Joerg Wichard, Tomoya Yukawa

A main challenge for artificial intelligence in scientific research is ensuring access to sufficient, high-quality data for the development of impactful models. Despite the abundance of public data, the most valuable knowledge often remains embedded within confidential corporate data silos. Although industries are increasingly open to sharing non-competitive insights, such collaboration is often constrained by the confidentiality of the underlying data. Federated learning makes it possible to share knowledge without compromising data privacy, but it has notable limitations. Here, we introduce FLuID (federated learning using information distillation), a data-centric application of federated distillation tailored to drug discovery aiming to preserve data privacy. We validate FLuID in two experiments, first involving public data simulating a virtual consortium and second in a real-world research collaboration between eight pharmaceutical companies. Although the alignment of the models with the partner specific domain remains challenging, the data-driven nature of FLuID offers several avenues to mitigate domain shift. FLuID fosters knowledge sharing among pharmaceutical organizations, paving the way for a new generation of models with enhanced performance and an expanded applicability domain in biological activity predictions.

{"title":"Data-driven federated learning in drug discovery with knowledge distillation","authors":"Thierry Hanser, Ernst Ahlberg, Alexander Amberg, Lennart T. Anger, Chris Barber, Richard J. Brennan, Alessandro Brigo, Annie Delaunois, Susanne Glowienke, Nigel Greene, Laura Johnston, Daniel Kuhn, Lara Kuhnke, Jean-François Marchaland, Wolfgang Muster, Jeffrey Plante, Friedrich Rippmann, Yogesh Sabnis, Friedemann Schmidt, Ruud van Deursen, Stéphane Werner, Angela White, Joerg Wichard, Tomoya Yukawa","doi":"10.1038/s42256-025-00991-2","DOIUrl":"https://doi.org/10.1038/s42256-025-00991-2","url":null,"abstract":"A main challenge for artificial intelligence in scientific research is ensuring access to sufficient, high-quality data for the development of impactful models. Despite the abundance of public data, the most valuable knowledge often remains embedded within confidential corporate data silos. Although industries are increasingly open to sharing non-competitive insights, such collaboration is often constrained by the confidentiality of the underlying data. Federated learning makes it possible to share knowledge without compromising data privacy, but it has notable limitations. Here, we introduce FLuID (federated learning using information distillation), a data-centric application of federated distillation tailored to drug discovery aiming to preserve data privacy. We validate FLuID in two experiments, first involving public data simulating a virtual consortium and second in a real-world research collaboration between eight pharmaceutical companies. Although the alignment of the models with the partner specific domain remains challenging, the data-driven nature of FLuID offers several avenues to mitigate domain shift. FLuID fosters knowledge sharing among pharmaceutical organizations, paving the way for a new generation of models with enhanced performance and an expanded applicability domain in biological activity predictions.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"211 1","pages":""},"PeriodicalIF":23.8,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143546524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Bridging the gap between machine confidence and human perceptions

IF 23.8 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Nature Machine Intelligence

Pub Date : 2025-03-03 DOI: 10.1038/s42256-025-01013-x

Ming Yin

Users often overestimate the accuracy of large language models (LLMs). A new approach examines user perceptions and finds that aligning LLM explanations with the models’ internal confidence improves user perception.

引用次数: 0

A unified deep framework for peptide–major histocompatibility complex–T cell receptor binding prediction

IF 23.8 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Nature Machine Intelligence

Pub Date : 2025-02-26 DOI: 10.1038/s42256-025-01002-0

Yunxiang Zhao, Jijun Yu, Yixin Su, You Shu, Enhao Ma, Jing Wang, Shuyang Jiang, Congwen Wei, Dongsheng Li, Zhen Huang, Gong Cheng, Hongguang Ren, Jiannan Feng

Antigen peptides that are presented by a major histocompatibility complex (MHC) and recognized by a T cell receptor (TCR) have an essential role in immunotherapy. Although substantial progress has been made in predicting MHC presentation, accurately predicting the binding interactions between antigen peptides, MHCs and TCRs remains a major computational challenge. In this paper, we propose a unified deep framework (called UniPMT) for peptide, MHC and TCR binding prediction to predict the binding between the peptide and the CDR3 of TCR β in general, presented by class I MHCs. UniPMT is comprehensively validated by a series of experiments and achieved state-of-the-art performance in the peptide–MHC–TCR, peptide–MHC and peptide–TCR binding prediction tasks with up to 15% improvements in area under the precision–recall curve taking the peptide–MHC–TCR binding prediction task as an example. In practical applications, UniPMT shows strong predictive power, correlates well with T cell clonal expansion and outperforms existing methods in neoantigen-specific binding prediction with up to 17.62% improvements in area under the precision–recall curve on experimentally validated datasets. Moreover, UniPMT provides interpretable insights into the identification of key binding sites and the quantification of peptide–MHC–TCR binding probabilities. In summary, UniPMT shows great potential to serve as a useful tool for antigen peptide discovery, disease immunotherapy and neoantigen vaccine design.

{"title":"A unified deep framework for peptide–major histocompatibility complex–T cell receptor binding prediction","authors":"Yunxiang Zhao, Jijun Yu, Yixin Su, You Shu, Enhao Ma, Jing Wang, Shuyang Jiang, Congwen Wei, Dongsheng Li, Zhen Huang, Gong Cheng, Hongguang Ren, Jiannan Feng","doi":"10.1038/s42256-025-01002-0","DOIUrl":"https://doi.org/10.1038/s42256-025-01002-0","url":null,"abstract":"Antigen peptides that are presented by a major histocompatibility complex (MHC) and recognized by a T cell receptor (TCR) have an essential role in immunotherapy. Although substantial progress has been made in predicting MHC presentation, accurately predicting the binding interactions between antigen peptides, MHCs and TCRs remains a major computational challenge. In this paper, we propose a unified deep framework (called UniPMT) for peptide, MHC and TCR binding prediction to predict the binding between the peptide and the CDR3 of TCR β in general, presented by class I MHCs. UniPMT is comprehensively validated by a series of experiments and achieved state-of-the-art performance in the peptide–MHC–TCR, peptide–MHC and peptide–TCR binding prediction tasks with up to 15% improvements in area under the precision–recall curve taking the peptide–MHC–TCR binding prediction task as an example. In practical applications, UniPMT shows strong predictive power, correlates well with T cell clonal expansion and outperforms existing methods in neoantigen-specific binding prediction with up to 17.62% improvements in area under the precision–recall curve on experimentally validated datasets. Moreover, UniPMT provides interpretable insights into the identification of key binding sites and the quantification of peptide–MHC–TCR binding probabilities. In summary, UniPMT shows great potential to serve as a useful tool for antigen peptide discovery, disease immunotherapy and neoantigen vaccine design.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"51 1","pages":""},"PeriodicalIF":23.8,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143495337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Large language models for scientific discovery in molecular property prediction

IF 23.8 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Nature Machine Intelligence

Pub Date : 2025-02-25 DOI: 10.1038/s42256-025-00994-z

Yizhen Zheng, Huan Yee Koh, Jiaxin Ju, Anh T. N. Nguyen, Lauren T. May, Geoffrey I. Webb, Shirui Pan

Large language models (LLMs) are a form of artificial intelligence system encapsulating vast knowledge in the form of natural language. These systems are adept at numerous complex tasks including creative writing, storytelling, translation, question-answering, summarization and computer code generation. Although LLMs have seen initial applications in natural sciences, their potential for driving scientific discovery remains largely unexplored. In this work, we introduce LLM4SD, a framework designed to harness LLMs for driving scientific discovery in molecular property prediction by synthesizing knowledge from literature and inferring knowledge from scientific data. LLMs synthesize knowledge by extracting established information from scientific literature, such as molecular weight being key to predicting solubility. For inference, LLMs identify patterns in molecular data, particularly in Simplified Molecular Input Line Entry System-encoded structures, such as halogen-containing molecules being more likely to cross the blood–brain barrier. This information is presented as interpretable knowledge, enabling the transformation of molecules into feature vectors. By using these features with interpretable models such as random forest, LLM4SD can outperform the current state of the art across a range of benchmark tasks for predicting molecular properties. We foresee it providing interpretable and potentially new insights, aiding scientific discovery in molecular property prediction.

{"title":"Large language models for scientific discovery in molecular property prediction","authors":"Yizhen Zheng, Huan Yee Koh, Jiaxin Ju, Anh T. N. Nguyen, Lauren T. May, Geoffrey I. Webb, Shirui Pan","doi":"10.1038/s42256-025-00994-z","DOIUrl":"https://doi.org/10.1038/s42256-025-00994-z","url":null,"abstract":"Large language models (LLMs) are a form of artificial intelligence system encapsulating vast knowledge in the form of natural language. These systems are adept at numerous complex tasks including creative writing, storytelling, translation, question-answering, summarization and computer code generation. Although LLMs have seen initial applications in natural sciences, their potential for driving scientific discovery remains largely unexplored. In this work, we introduce LLM4SD, a framework designed to harness LLMs for driving scientific discovery in molecular property prediction by synthesizing knowledge from literature and inferring knowledge from scientific data. LLMs synthesize knowledge by extracting established information from scientific literature, such as molecular weight being key to predicting solubility. For inference, LLMs identify patterns in molecular data, particularly in Simplified Molecular Input Line Entry System-encoded structures, such as halogen-containing molecules being more likely to cross the blood–brain barrier. This information is presented as interpretable knowledge, enabling the transformation of molecules into feature vectors. By using these features with interpretable models such as random forest, LLM4SD can outperform the current state of the art across a range of benchmark tasks for predicting molecular properties. We foresee it providing interpretable and potentially new insights, aiding scientific discovery in molecular property prediction.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"14 1","pages":""},"PeriodicalIF":23.8,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143486029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Teaching robots to build simulations of themselves

IF 23.8 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Nature Machine Intelligence

Pub Date : 2025-02-25 DOI: 10.1038/s42256-025-01006-w

Yuhang Hu, Jiong Lin, Hod Lipson

The emergence of vision catalysed a pivotal evolutionary advancement, enabling organisms not only to perceive but also to interact intelligently with their environment. This transformation is mirrored by the evolution of robotic systems, where the ability to leverage vision to simulate and predict their own dynamics marks a leap towards autonomy and self-awareness. Humans utilize vision to record experiences and internally simulate potential actions. For example, we can imagine that, if we stand up and raise our arms, the body will form a ‘T’ shape without physical movement. Similarly, simulation allows robots to plan and predict the outcomes of potential actions without execution. Here we introduce a self-supervised learning framework to enable robots to model and predict their morphology, kinematics and motor control using only brief raw video data, eliminating the need for extensive real-world data collection and kinematic priors. By observing their own movements, akin to humans watching their reflection in a mirror, robots learn an ability to simulate themselves and predict their spatial motion for various tasks. Our results demonstrate that this self-learned simulation not only enables accurate motion planning but also allows the robot to detect abnormalities and recover from damage.

{"title":"Teaching robots to build simulations of themselves","authors":"Yuhang Hu, Jiong Lin, Hod Lipson","doi":"10.1038/s42256-025-01006-w","DOIUrl":"https://doi.org/10.1038/s42256-025-01006-w","url":null,"abstract":"The emergence of vision catalysed a pivotal evolutionary advancement, enabling organisms not only to perceive but also to interact intelligently with their environment. This transformation is mirrored by the evolution of robotic systems, where the ability to leverage vision to simulate and predict their own dynamics marks a leap towards autonomy and self-awareness. Humans utilize vision to record experiences and internally simulate potential actions. For example, we can imagine that, if we stand up and raise our arms, the body will form a ‘T’ shape without physical movement. Similarly, simulation allows robots to plan and predict the outcomes of potential actions without execution. Here we introduce a self-supervised learning framework to enable robots to model and predict their morphology, kinematics and motor control using only brief raw video data, eliminating the need for extensive real-world data collection and kinematic priors. By observing their own movements, akin to humans watching their reflection in a mirror, robots learn an ability to simulate themselves and predict their spatial motion for various tasks. Our results demonstrate that this self-learned simulation not only enables accurate motion planning but also allows the robot to detect abnormalities and recover from damage.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"65 1","pages":""},"PeriodicalIF":23.8,"publicationDate":"2025-02-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143486028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Seeking visions for sustainable AI

IF 18.8 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Nature Machine Intelligence

Pub Date : 2025-02-24 DOI: 10.1038/s42256-025-01008-8

As countries around the world heavily invest in artificial intelligence (AI) and related infrastructure, the sustainable development of AI technology needs to be higher on the global agenda.

引用次数: 0

Goals as reward-producing programs

IF 18.8 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Nature Machine Intelligence

Pub Date : 2025-02-21 DOI: 10.1038/s42256-025-00981-4

Guy Davidson, Graham Todd, Julian Togelius, Todd M. Gureckis, Brenden M. Lake

People are remarkably capable of generating their own goals, beginning with child’s play and continuing into adulthood. Despite considerable empirical and computational work on goals and goal-oriented behaviour, models are still far from capturing the richness of everyday human goals. Here we bridge this gap by collecting a dataset of human-generated playful goals (in the form of scorable, single-player games), modelling them as reward-producing programs and generating novel human-like goals through program synthesis. Reward-producing programs capture the rich semantics of goals through symbolic operations that compose, add temporal constraints and allow program execution on behavioural traces to evaluate progress. To build a generative model of goals, we learn a fitness function over the infinite set of possible goal programs and sample novel goals with a quality-diversity algorithm. Human evaluators found that model-generated goals, when sampled from partitions of program space occupied by human examples, were indistinguishable from human-created games. We also discovered that our model’s internal fitness scores predict games that are evaluated as more fun to play and more human-like. To enable artificial agents to generate human-like goals, a model must capture the complexity and diversity of human goals. Davidson et al. model playful goals from a naturalistic experiment as reward-producing programs, mapping an agent’s behaviour to goal success. They then develop a computational model to generate diverse human-like goals.

{"title":"Goals as reward-producing programs","authors":"Guy Davidson, Graham Todd, Julian Togelius, Todd M. Gureckis, Brenden M. Lake","doi":"10.1038/s42256-025-00981-4","DOIUrl":"10.1038/s42256-025-00981-4","url":null,"abstract":"People are remarkably capable of generating their own goals, beginning with child’s play and continuing into adulthood. Despite considerable empirical and computational work on goals and goal-oriented behaviour, models are still far from capturing the richness of everyday human goals. Here we bridge this gap by collecting a dataset of human-generated playful goals (in the form of scorable, single-player games), modelling them as reward-producing programs and generating novel human-like goals through program synthesis. Reward-producing programs capture the rich semantics of goals through symbolic operations that compose, add temporal constraints and allow program execution on behavioural traces to evaluate progress. To build a generative model of goals, we learn a fitness function over the infinite set of possible goal programs and sample novel goals with a quality-diversity algorithm. Human evaluators found that model-generated goals, when sampled from partitions of program space occupied by human examples, were indistinguishable from human-created games. We also discovered that our model’s internal fitness scores predict games that are evaluated as more fun to play and more human-like. To enable artificial agents to generate human-like goals, a model must capture the complexity and diversity of human goals. Davidson et al. model playful goals from a naturalistic experiment as reward-producing programs, mapping an agent’s behaviour to goal success. They then develop a computational model to generate diverse human-like goals.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 2","pages":"205-220"},"PeriodicalIF":18.8,"publicationDate":"2025-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143462924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Scalable and robust DNA-based storage via coding theory and deep learning

IF 23.8 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Nature Machine Intelligence

Pub Date : 2025-02-21 DOI: 10.1038/s42256-025-01003-z

Daniella Bar-Lev, Itai Orr, Omer Sabary, Tuvi Etzion, Eitan Yaakobi

The global data sphere is expanding exponentially, projected to hit 180 zettabytes by 2025, whereas current technologies are not anticipated to scale at nearly the same rate. DNA-based storage emerges as a crucial solution to this gap, enabling digital information to be archived in DNA molecules. This method enjoys major advantages over magnetic and optical storage solutions such as exceptional information density, enhanced data durability and negligible power consumption to maintain data integrity. To access the data, an information retrieval process is employed, where some of the main bottlenecks are the scalability and accuracy, which have a natural tradeoff between the two. Here we show a modular and holistic approach that combines deep neural networks trained on simulated data, tensor product-based error-correcting codes and a safety margin mechanism into a single coherent pipeline. We demonstrated our solution on 3.1 MB of information using two different sequencing technologies. Our work improves upon the current leading solutions with a 3,200× increase in speed and a 40% improvement in accuracy and offers a code rate of 1.6 bits per base in a high-noise regime. In a broader sense, our work shows a viable path to commercial DNA storage solutions hindered by current information retrieval processes.

{"title":"Scalable and robust DNA-based storage via coding theory and deep learning","authors":"Daniella Bar-Lev, Itai Orr, Omer Sabary, Tuvi Etzion, Eitan Yaakobi","doi":"10.1038/s42256-025-01003-z","DOIUrl":"https://doi.org/10.1038/s42256-025-01003-z","url":null,"abstract":"The global data sphere is expanding exponentially, projected to hit 180 zettabytes by 2025, whereas current technologies are not anticipated to scale at nearly the same rate. DNA-based storage emerges as a crucial solution to this gap, enabling digital information to be archived in DNA molecules. This method enjoys major advantages over magnetic and optical storage solutions such as exceptional information density, enhanced data durability and negligible power consumption to maintain data integrity. To access the data, an information retrieval process is employed, where some of the main bottlenecks are the scalability and accuracy, which have a natural tradeoff between the two. Here we show a modular and holistic approach that combines deep neural networks trained on simulated data, tensor product-based error-correcting codes and a safety margin mechanism into a single coherent pipeline. We demonstrated our solution on 3.1 MB of information using two different sequencing technologies. Our work improves upon the current leading solutions with a 3,200× increase in speed and a 40% improvement in accuracy and offers a code rate of 1.6 bits per base in a high-noise regime. In a broader sense, our work shows a viable path to commercial DNA storage solutions hindered by current information retrieval processes.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"6 1","pages":""},"PeriodicalIF":23.8,"publicationDate":"2025-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143462923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Categorizing robots by performance fitness into the tree of robots

IF 23.8 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Nature Machine Intelligence

Pub Date : 2025-02-21 DOI: 10.1038/s42256-025-00995-y

Robin Jeanne Kirschner, Kübra Karacan, Alessandro Melone, Sami Haddadin

Robots are typically classified based on specific morphological features, like their kinematic structure. However, a complex interplay between morphology and intelligence shapes how well a robot performs processes. Just as delicate surgical procedures demand high dexterity and tactile precision, manual warehouse or construction work requires strength and endurance. These process requirements necessitate robot systems that provide a level of performance fitting the process. In this work, we introduce the tree of robots as a taxonomy to bridge the gap between morphological classification and process-based performance. It classifies robots based on their fitness to perform, for example, physical interaction processes. Using 11 industrial manipulators, we constructed the first part of the tree of robots based on a carefully deduced set of metrics reflecting fundamental robot capabilities for various industrial physical interaction processes. Through significance analysis, we identified substantial differences between the systems, grouping them via an expectation-maximization algorithm to create a fitness-based robot classification that is open for contributions and accessible.

{"title":"Categorizing robots by performance fitness into the tree of robots","authors":"Robin Jeanne Kirschner, Kübra Karacan, Alessandro Melone, Sami Haddadin","doi":"10.1038/s42256-025-00995-y","DOIUrl":"https://doi.org/10.1038/s42256-025-00995-y","url":null,"abstract":"Robots are typically classified based on specific morphological features, like their kinematic structure. However, a complex interplay between morphology and intelligence shapes how well a robot performs processes. Just as delicate surgical procedures demand high dexterity and tactile precision, manual warehouse or construction work requires strength and endurance. These process requirements necessitate robot systems that provide a level of performance fitting the process. In this work, we introduce the tree of robots as a taxonomy to bridge the gap between morphological classification and process-based performance. It classifies robots based on their fitness to perform, for example, physical interaction processes. Using 11 industrial manipulators, we constructed the first part of the tree of robots based on a carefully deduced set of metrics reflecting fundamental robot capabilities for various industrial physical interaction processes. Through significance analysis, we identified substantial differences between the systems, grouping them via an expectation-maximization algorithm to create a fitness-based robot classification that is open for contributions and accessible.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"20 1","pages":""},"PeriodicalIF":23.8,"publicationDate":"2025-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143462928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Deep lead optimization enveloped in protein pocket and its application in designing potent and selective ligands targeting LTK protein

IF 23.8 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Nature Machine Intelligence

Pub Date : 2025-02-20 DOI: 10.1038/s42256-025-00997-w

Shicheng Chen, Odin Zhang, Chenran Jiang, Huifeng Zhao, Xujun Zhang, Mengting Chen, Yun Liu, Qun Su, Zhenxing Wu, Xinyue Wang, Wanglin Qu, Yuanyi Ye, Xin Chai, Ning Wang, Tianyue Wang, Yuan An, Guanlin Wu, Qianqian Yang, Jiean Chen, Wei Xie, Haitao Lin, Dan Li, Chang-Yu Hsieh, Yong Huang, Yu Kang, Tingjun Hou, Peichen Pan

Optimizing the chemical structure of promising drug candidates through systematic modifications to improve potency and physiochemical properties is a vital step in the drug discovery pipeline. In contrast to the well-established de novo generation schemes, computational methods specifically tailored for lead optimization remain largely underexplored. Prior models are often limited to addressing specific subtasks, such as generating two-dimensional molecular structures, while neglecting crucial protein–ligand interactions in three-dimensional space. To overcome these challenges, we propose Delete (Deep lead optimization enveloped in protein pocket), a one-stop solution for lead optimization by combining generative artificial intelligence and structure-based approaches. Our model can handle all subtasks of lead optimization through a unified deleting (masking) strategy, and it accounts for intricate pocket–ligand interactions through an equivariant network design. Statistical assessments and retrospective studies across individual subtasks demonstrate that Delete has an outstanding ability to craft molecules with superior protein-binding energy and reasonable drug-likeness using given fragments or atoms. Subsequently, we utilize Delete to design inhibitors targeting the previously identified LTK protein. Among the ligands designed by Delete, CA-B-1 is successfully validated as a potent (1.36 nM) and selective inhibitor by in vitro and in vivo experiments. This work represents a successful implementation of the powerful structure-based lead optimization model, Delete, for rapid and controllable rational drug design.

{"title":"Deep lead optimization enveloped in protein pocket and its application in designing potent and selective ligands targeting LTK protein","authors":"Shicheng Chen, Odin Zhang, Chenran Jiang, Huifeng Zhao, Xujun Zhang, Mengting Chen, Yun Liu, Qun Su, Zhenxing Wu, Xinyue Wang, Wanglin Qu, Yuanyi Ye, Xin Chai, Ning Wang, Tianyue Wang, Yuan An, Guanlin Wu, Qianqian Yang, Jiean Chen, Wei Xie, Haitao Lin, Dan Li, Chang-Yu Hsieh, Yong Huang, Yu Kang, Tingjun Hou, Peichen Pan","doi":"10.1038/s42256-025-00997-w","DOIUrl":"https://doi.org/10.1038/s42256-025-00997-w","url":null,"abstract":"Optimizing the chemical structure of promising drug candidates through systematic modifications to improve potency and physiochemical properties is a vital step in the drug discovery pipeline. In contrast to the well-established de novo generation schemes, computational methods specifically tailored for lead optimization remain largely underexplored. Prior models are often limited to addressing specific subtasks, such as generating two-dimensional molecular structures, while neglecting crucial protein–ligand interactions in three-dimensional space. To overcome these challenges, we propose Delete (Deep lead optimization enveloped in protein pocket), a one-stop solution for lead optimization by combining generative artificial intelligence and structure-based approaches. Our model can handle all subtasks of lead optimization through a unified deleting (masking) strategy, and it accounts for intricate pocket–ligand interactions through an equivariant network design. Statistical assessments and retrospective studies across individual subtasks demonstrate that Delete has an outstanding ability to craft molecules with superior protein-binding energy and reasonable drug-likeness using given fragments or atoms. Subsequently, we utilize Delete to design inhibitors targeting the previously identified LTK protein. Among the ligands designed by Delete, CA-B-1 is successfully validated as a potent (1.36 nM) and selective inhibitor by in vitro and in vivo experiments. This work represents a successful implementation of the powerful structure-based lead optimization model, Delete, for rapid and controllable rational drug design.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"58 1","pages":""},"PeriodicalIF":23.8,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143452007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

下一页尾页

类型

全部化学•材料生命科学医学物理工程技术环境•农林材料科学地球科学法学管理学化学环境科学与生态学计算机科学教育学经济学农林科学人文科学生物学数学物理与天体物理心理学综合性期刊其他工业工程理学历史学农学文学信息工程

数据库

全部 ACS Publications Elsevier ieeexplore Springer The Royal Society of Chemistry Wiley

期刊

Nature Machine Intelligence

全部 Acc. Chem. Res. ACS Applied Bio Materials ACS Appl. Electron. Mater. ACS Appl. Energy Mater. ACS Appl. Mater. Interfaces ACS Appl. Nano Mater. ACS Appl. Polym. Mater. ACS BIOMATER-SCI ENG ACS Catal. ACS Cent. Sci. ACS Chem. Biol. ACS Chemical Health & Safety ACS Chem. Neurosci. ACS Comb. Sci. ACS Earth Space Chem. ACS Energy Lett. ACS Infect. Dis. ACS Macro Lett. ACS Mater. Lett. ACS Med. Chem. Lett. ACS Nano ACS Omega ACS Photonics ACS Sens. ACS Sustainable Chem. Eng. ACS Synth. Biol. Anal. Chem. BIOCHEMISTRY-US Bioconjugate Chem. BIOMACROMOLECULES Chem. Res. Toxicol. Chem. Rev. Chem. Mater. CRYST GROWTH DES ENERG FUEL Environ. Sci. Technol. Environ. Sci. Technol. Lett. Eur. J. Inorg. Chem. IND ENG CHEM RES Inorg. Chem. J. Agric. Food. Chem. J. Chem. Eng. Data J. Chem. Educ. J. Chem. Inf. Model. J. Chem. Theory Comput. J. Med. Chem. J. Nat. Prod. J PROTEOME RES J. Am. Chem. Soc. LANGMUIR MACROMOLECULES Mol. Pharmaceutics Nano Lett. Org. Lett. ORG PROCESS RES DEV ORGANOMETALLICS J. Org. Chem. J. Phys. Chem. J. Phys. Chem. A J. Phys. Chem. B J. Phys. Chem. C J. Phys. Chem. Lett. Analyst Anal. Methods Biomater. Sci. Catal. Sci. Technol. Chem. Commun. Chem. Soc. Rev. CHEM EDUC RES PRACT CRYSTENGCOMM Dalton Trans. Energy Environ. Sci. ENVIRON SCI-NANO ENVIRON SCI-PROC IMP ENVIRON SCI-WAT RES Faraday Discuss. Food Funct. Green Chem. Inorg. Chem. Front. Integr. Biol. J. Anal. At. Spectrom. J. Mater. Chem. A J. Mater. Chem. B J. Mater. Chem. C Lab Chip Mater. Chem. Front. Mater. Horiz. MEDCHEMCOMM Metallomics Mol. Biosyst. Mol. Syst. Des. Eng. Nanoscale Nanoscale Horiz. Nat. Prod. Rep. New J. Chem. Org. Biomol. Chem. Org. Chem. Front. PHOTOCH PHOTOBIO SCI PCCP Polym. Chem.

﹀