General-purpose foundation models for increased autonomy in robot-assisted surgery

Nature Machine Intelligence · IF 18.8 · Region 1 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence) · Pub Date: 2024-11-01 · DOI: 10.1038/s42256-024-00917-4
Samuel Schmidgall, Ji Woong Kim, Alan Kuntz, Ahmed Ezzat Ghazi, Axel Krieger
Nature Machine Intelligence, volume 6, issue 11, pages 1275-1283. Publication type: Journal Article. Full text: https://www.nature.com/articles/s42256-024-00917-4
Citation count: 0

Abstract

The dominant paradigm for end-to-end robot learning focuses on optimizing task-specific objectives that solve a single robotic problem such as picking up an object or reaching a target position. However, recent work on high-capacity models in robotics has shown promise towards being trained on large collections of diverse and task-agnostic datasets of video demonstrations. These models have shown impressive levels of generalization to unseen circumstances, especially as the amount of data and the model complexity scale. Surgical robot systems that learn from data have struggled to advance as quickly as other fields of robot learning for a few reasons: there is a lack of existing large-scale open-source data to train models; it is challenging to model the soft-body deformations that these robots work with during surgery because simulation cannot match the physical and visual complexity of biological tissue; and surgical robots risk harming patients when tested in clinical trials and require more extensive safety measures. This Perspective aims to provide a path towards increasing robot autonomy in robot-assisted surgery through the development of a multi-modal, multi-task, vision–language–action model for surgical robots. Ultimately, we argue that surgical robots are uniquely positioned to benefit from general-purpose models and provide four guiding actions towards increased autonomy in robot-assisted surgery.

Schmidgall et al. describe a pathway for building general-purpose machine learning models for robot-assisted surgery, including mechanisms for avoiding risk and handing over control to surgeons, and improving safety and outcomes beyond demonstration data.

Source journal
CiteScore: 36.90
Self-citation rate: 2.10%
Articles published: 127
Journal description: Nature Machine Intelligence is a distinguished publication that presents original research and reviews on various topics in machine learning, robotics, and AI. Our focus extends beyond these fields, exploring their profound impact on other scientific disciplines, as well as societal and industrial aspects. We recognize limitless possibilities wherein machine intelligence can augment human capabilities and knowledge in domains like scientific exploration, healthcare, medical diagnostics, and the creation of safe and sustainable cities, transportation, and agriculture. Simultaneously, we acknowledge the emergence of ethical, social, and legal concerns due to the rapid pace of advancements. To foster interdisciplinary discussions on these far-reaching implications, Nature Machine Intelligence serves as a platform for dialogue facilitated through Comments, News Features, News & Views articles, and Correspondence. Our goal is to encourage a comprehensive examination of these subjects. Similar to all Nature-branded journals, Nature Machine Intelligence operates under the guidance of a team of skilled editors. We adhere to a fair and rigorous peer-review process, ensuring high standards of copy-editing and production, swift publication, and editorial independence.
Latest articles from this journal
A unified cross-attention model for predicting antigen binding specificity to both HLA and TCR molecules
Deep learning enhances the prediction of HLA class I-presented CD8+ T cell epitopes in foreign pathogens
Machine learning solutions looking for PDE problems
Evolutionary optimization of model merging recipes
Moving towards genome-wide data integration for patient stratification with Integrate Any Omics