Title: Surgical phase classification and operative skill assessment through spatial context aware CNNs and time-invariant feature extracting autoencoders
Authors: Chakka Sai Pradeep, Neelam Sinha
Journal: Biocybernetics and Biomedical Engineering, volume 43, issue 4, pages 700-724
Publication date: 2023-10-01
Publication type: Journal Article
DOI: 10.1016/j.bbe.2023.10.001
URL: https://www.sciencedirect.com/science/article/pii/S0208521623000554
Impact factor: 5.3
JCR: Q1 (ENGINEERING, BIOMEDICAL)
Region category: Medicine (region 2)
Open access: no
Citation count: 0
Abstract
Automated surgical video analysis promises improved healthcare. We propose a novel spatial-context-aware combined loss function for end-to-end encoder-decoder training for Surgical Phase Classification (SPC) on laparoscopic cholecystectomy (LC) videos. The proposed loss function leverages fine-grained class activation maps obtained from fused multi-layer Layer-CAM for supervised learning of SPC, which also yields improved Layer-CAM explanations. After classification, we apply graph theory to incorporate the known hierarchies of surgical phases. We report a peak SPC accuracy of 96.16%, precision of 94.08%, and recall of 90.02% on the public Cholec80 dataset, which has 7 phases. Our method uses just 73.5% of the parameters of the existing state-of-the-art method while improving accuracy by 0.5% and precision by 1.76%, with comparable recall and a standard deviation one order of magnitude lower. We also propose a DNN-based surgical skill assessment methodology. This approach uses the surgical phase prediction scores from the final fully connected layer of the spatial-context-aware classifier to form a multi-channel temporal signal of surgical phases. A time-invariant representation is obtained from this temporal signal through time- and frequency-domain analyses. Autoencoder-based time-invariant features are used for reconstruction and for identifying prominent peaks in dissimilarity curves. We devise a surgical skill measure (SSM) based on the spatial-context-aware temporal-prominence-of-peaks curve. SSM values are expected to be high when the surgery is executed skillfully, in line with the expert-assessed GOALS metric. We illustrate this trend on the Cholec80 and m2cai16-tool datasets in comparison with the GOALS metric. The SSM trend concurs with the GOALS metric on these test videos, making it a promising step towards automated surgical skill assessment.
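The abstract does not say how the graph of phase hierarchies is applied after classification. As a rough illustration only, and not the authors' method, the sketch below uses Viterbi-style decoding over a directed transition graph so that the predicted phase sequence never follows a forbidden transition. The three-phase graph, the per-frame scores, and the names `constrained_decode`, `allowed`, and `scores` are all hypothetical.

```python
import math

def constrained_decode(scores, allowed, start=0):
    """Most likely phase sequence that only follows edges in `allowed`.

    scores  : list of per-frame score lists (one positive score per phase)
    allowed : dict mapping phase -> set of phases that may follow it
    start   : phase the sequence is assumed to begin in (an assumption)
    """
    n = len(scores[0])
    neg_inf = float("-inf")
    log = lambda x: math.log(max(x, 1e-12))  # floor avoids log(0)

    # best[p] = best log-score of any valid path ending in phase p
    best = [neg_inf] * n
    best[start] = log(scores[0][start])
    back = []  # back[t][q] = predecessor of phase q at frame t + 1

    for frame in scores[1:]:
        new_best = [neg_inf] * n
        ptr = [start] * n
        for q in range(n):
            # reachable predecessors p that have an edge p -> q in the graph
            cands = [(best[p], p) for p in range(n)
                     if q in allowed[p] and best[p] > neg_inf]
            if cands:
                prev_score, p = max(cands)
                new_best[q] = prev_score + log(frame[q])
                ptr[q] = p
        best = new_best
        back.append(ptr)

    # trace the best path backwards from the best final phase
    q = max(range(n), key=lambda i: best[i])
    path = [q]
    for ptr in reversed(back):
        q = ptr[q]
        path.append(q)
    return path[::-1]

# Hypothetical toy graph: phases may repeat or advance, never go back.
allowed = {0: {0, 1}, 1: {1, 2}, 2: {2}}
# Hypothetical per-frame phase scores; frame 3 wrongly favors phase 0.
scores = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.4, 0.1],
    [0.1, 0.7, 0.2],
    [0.1, 0.2, 0.7],
]
framewise = [max(range(len(f)), key=f.__getitem__) for f in scores]
decoded = constrained_decode(scores, allowed)
print(framewise)  # [0, 0, 1, 0, 1, 2]
print(decoded)    # [0, 0, 1, 1, 1, 2]
```

In this toy case the frame-wise argmax jumps back from phase 1 to phase 0 at the fourth frame, which the graph forbids; the constrained decoding keeps that frame in phase 1 instead.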
About the journal:
Biocybernetics and Biomedical Engineering is a quarterly journal, founded in 1981, devoted to publishing the results of original, innovative and creative research in the field of biocybernetics and biomedical engineering. The field bridges mathematical, physical, chemical and engineering methods and technology to analyse physiological processes in living organisms and to develop methods, devices and systems used in biology and medicine, mainly in medical diagnosis, monitoring systems and therapy. The Journal's mission is to advance scientific discovery into new or improved standards of care, and to promote a wide-ranging exchange between science and its application to humans.