Title: Full fine-tuning strategy for endoscopic foundation models with expanded learnable offset parameters
Authors: Minghan Dong, Xiangwei Zheng, Xia Zhang, Xingyu Zhang, Mingzhe Zhang
Journal: Biomedical Physics & Engineering Express (impact factor 1.3; JCR Q3, Radiology, Nuclear Medicine & Medical Imaging)
Publication type: Journal Article
Publication date: 2025-01-27
DOI: 10.1088/2057-1976/adaec3 (https://doi.org/10.1088/2057-1976/adaec3)
Citations: 0
Abstract
In the medical field, endoscopic video analysis is crucial for disease diagnosis and minimally invasive surgery. Endoscopic Foundation Models (Endo-FM) use large-scale self-supervised pre-training on endoscopic video data and video transformer models to capture long-range spatiotemporal dependencies. However, detecting complex lesions such as gastrointestinal metaplasia (GIM) in endoscopic videos remains challenging because of unclear boundaries and indistinct features, and Endo-FM has not performed well on this task. To address this, we propose a full fine-tuning strategy with an Extended Learnable Offset Parameter (ELOP), which improves model performance by introducing learnable offset parameters in the input space. We also propose a novel loss function that combines cross-entropy loss and focal loss through a weighted sum, enabling the model to focus better on hard-to-classify samples during training. We validated ELOP on a private GIM dataset from a local grade-A tertiary hospital and on a public polyp detection dataset. Experimental results show that ELOP improves detection accuracy by 6.25% and 3.75%, respectively, compared with the original Endo-FM. In summary, ELOP provides an effective solution for detecting complex lesions in endoscopic videos and supports more precise diagnoses.
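The weighted sum of cross-entropy loss and focal loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything specific here is an assumption: the convex-combination form `weight * CE + (1 - weight) * FL`, the default `weight=0.5`, the focusing parameter `gamma=2.0`, and all function names are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Standard cross-entropy for one sample: -log(p_t)."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    The (1 - p_t)^gamma factor shrinks the loss of well-classified
    samples, so hard samples dominate the total. gamma=2.0 is an
    assumed default, not a value reported in the abstract.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def combined_loss(logits, target, weight=0.5, gamma=2.0):
    """Assumed form of the weighted sum: weight*CE + (1 - weight)*FL."""
    ce = cross_entropy(logits, target)
    fl = focal_loss(logits, target, gamma)
    return weight * ce + (1.0 - weight) * fl


if __name__ == "__main__":
    easy = [4.0, 0.0]  # confidently correct for target class 0
    hard = [0.0, 4.0]  # confidently wrong for target class 0
    print("easy sample:", combined_loss(easy, 0))
    print("hard sample:", combined_loss(hard, 0))
```

In this sketch, setting `weight=1.0` recovers plain cross-entropy and `gamma=0.0` makes the focal term identical to cross-entropy, which makes it easy to see how each knob shifts emphasis toward hard-to-classify samples.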
About the journal:
BPEX is an inclusive, international, multidisciplinary journal devoted to publishing new research on any application of physics and/or engineering in medicine and/or biology. Characterized by a broad geographical coverage and a fast-track peer-review process, relevant topics include all aspects of biophysics, medical physics and biomedical engineering. Papers that are almost entirely clinical or biological in their focus are not suitable. The journal has an emphasis on publishing interdisciplinary work and bringing research fields together, encompassing experimental, theoretical and computational work.