SPRMamba: Surgical Phase Recognition for Endoscopic Submucosal Dissection with Mamba
Xiangning Zhang, Jinnan Chen, Qingwei Zhang, Chengfeng Zhou, Zhengjie Zhang, Xiaobo Li, Dahong Qian
arXiv - CS - Computer Vision and Pattern Recognition, 2024-09-18
arXiv:2409.12108
Abstract
Endoscopic Submucosal Dissection (ESD) is a minimally invasive procedure
initially designed for the treatment of early gastric cancer but is now widely
used for various gastrointestinal lesions. Computer-assisted surgery systems
have played a crucial role in improving the precision and safety of ESD
procedures; however, their effectiveness is limited by the accuracy of surgical
phase recognition. The intricate nature of ESD, with different lesion
characteristics and tissue structures, presents challenges for real-time
surgical phase recognition algorithms. Existing surgical phase recognition
algorithms struggle to efficiently capture temporal contexts in video-based
scenarios, leading to insufficient performance. To address these issues, we
propose SPRMamba, a novel Mamba-based framework for ESD surgical phase
recognition. SPRMamba leverages the strengths of Mamba for long-term temporal
modeling while introducing the Scaled Residual TranMamba block to enhance the
capture of fine-grained details, overcoming the limitations of traditional
temporal models like Temporal Convolutional Networks and Transformers.
Moreover, a Temporal Sample Strategy is introduced to accelerate
processing, which is essential for real-time phase recognition in clinical
settings. Extensive testing on the ESD385 dataset and the cholecystectomy
Cholec80 dataset demonstrates that SPRMamba surpasses existing state-of-the-art
methods and exhibits greater robustness across various surgical phase
recognition tasks.
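
As a rough illustration of why state-space models such as Mamba suit long-term temporal modeling, the sketch below runs a plain discrete linear recurrence over a sequence of per-frame features. It is not SPRMamba or its Scaled Residual TranMamba block: the function name `ssm_scan`, the scalar parameters `a`, `b`, `c`, and the toy input are all assumptions made for illustration, and Mamba's input-dependent (selective) parameters are omitted. The point is only that each frame costs one state update, so the total cost grows linearly with video length.

```python
from typing import List


def ssm_scan(xs: List[float], a: float, b: float, c: float) -> List[float]:
    """Run a scalar discrete linear state-space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t   (state update, carries past context)
    y_t = c * h_t                 (per-step output)
    """
    h = 0.0  # hidden state starts empty
    ys = []
    for x in xs:
        h = a * h + b * x  # one constant-cost update per frame
        ys.append(c * h)
    return ys


# Hypothetical per-frame scalar feature: an impulse at the first frame.
frame_features = [1.0, 0.0, 0.0, 0.0]
outputs = ssm_scan(frame_features, a=0.5, b=1.0, c=2.0)
print(outputs)  # [2.0, 1.0, 0.5, 0.25]
```

The impulse at the first frame still influences later outputs, decaying by the factor `a` at each step, which is the mechanism by which such a recurrence carries context forward in time.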