{"title":"Development, deployment and scaling of operating room-ready artificial intelligence for real-time surgical decision support","authors":"Sergey Protserov, Jaryd Hunter, Haochi Zhang, Pouria Mashouri, Caterina Masino, Michael Brudno, Amin Madani","doi":"10.1038/s41746-024-01225-2","DOIUrl":null,"url":null,"abstract":"Deep learning for computer vision can be leveraged for interpreting surgical scenes and providing surgeons with real-time guidance to avoid complications. However, neither generalizability nor scalability of computer-vision-based surgical guidance systems have been demonstrated, especially to geographic locations that lack hardware and infrastructure necessary for real-time inference. We propose a new equipment-agnostic framework for real-time use in operating suites. Using laparoscopic cholecystectomy and semantic segmentation models for predicting safe/dangerous (“Go”/”No-Go”) zones of dissection as an example use case, this study aimed to develop and test the performance of a novel data pipeline linked to a web-platform that enables real-time deployment from any edge device. To test this infrastructure and demonstrate its scalability and generalizability, lightweight U-Net and SegFormer models were trained on annotated frames from a large and diverse multicenter dataset from 136 institutions, and then tested on a separate prospectively collected dataset. A web-platform was created to enable real-time inference on any surgical video stream, and performance was tested on and optimized for a range of network speeds. The U-Net and SegFormer models respectively achieved mean Dice scores of 57% and 60%, precision 45% and 53%, and recall 82% and 75% for predicting the Go zone, and mean Dice scores of 76% and 76%, precision 68% and 68%, and recall 92% and 92% for predicting the No-Go zone. After optimization of the client-server interaction over the network, we deliver a prediction stream of at least 60 fps and with a maximum round-trip delay of 70 ms for speeds above 8 Mbps. Clinical deployment of machine learning models for surgical guidance is feasible and cost-effective using a generalizable, scalable and equipment-agnostic framework that lacks dependency on hardware with high computing performance or ultra-fast internet connection speed.","PeriodicalId":19349,"journal":{"name":"NPJ Digital Medicine","volume":null,"pages":null},"PeriodicalIF":12.4000,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.com/articles/s41746-024-01225-2.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"NPJ Digital Medicine","FirstCategoryId":"3","ListUrlMain":"https://www.nature.com/articles/s41746-024-01225-2","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"HEALTH CARE SCIENCES & SERVICES","Score":null,"Total":0}
引用次数: 0
Abstract
Deep learning for computer vision can be leveraged for interpreting surgical scenes and providing surgeons with real-time guidance to avoid complications. However, neither generalizability nor scalability of computer-vision-based surgical guidance systems have been demonstrated, especially to geographic locations that lack hardware and infrastructure necessary for real-time inference. We propose a new equipment-agnostic framework for real-time use in operating suites. Using laparoscopic cholecystectomy and semantic segmentation models for predicting safe/dangerous (“Go”/”No-Go”) zones of dissection as an example use case, this study aimed to develop and test the performance of a novel data pipeline linked to a web-platform that enables real-time deployment from any edge device. To test this infrastructure and demonstrate its scalability and generalizability, lightweight U-Net and SegFormer models were trained on annotated frames from a large and diverse multicenter dataset from 136 institutions, and then tested on a separate prospectively collected dataset. A web-platform was created to enable real-time inference on any surgical video stream, and performance was tested on and optimized for a range of network speeds. The U-Net and SegFormer models respectively achieved mean Dice scores of 57% and 60%, precision 45% and 53%, and recall 82% and 75% for predicting the Go zone, and mean Dice scores of 76% and 76%, precision 68% and 68%, and recall 92% and 92% for predicting the No-Go zone. After optimization of the client-server interaction over the network, we deliver a prediction stream of at least 60 fps and with a maximum round-trip delay of 70 ms for speeds above 8 Mbps. Clinical deployment of machine learning models for surgical guidance is feasible and cost-effective using a generalizable, scalable and equipment-agnostic framework that lacks dependency on hardware with high computing performance or ultra-fast internet connection speed.
期刊介绍:
npj Digital Medicine is an online open-access journal that focuses on publishing peer-reviewed research in the field of digital medicine. The journal covers various aspects of digital medicine, including the application and implementation of digital and mobile technologies in clinical settings, virtual healthcare, and the use of artificial intelligence and informatics.
The primary goal of the journal is to support innovation and the advancement of healthcare through the integration of new digital and mobile technologies. When determining if a manuscript is suitable for publication, the journal considers four important criteria: novelty, clinical relevance, scientific rigor, and digital innovation.