SF-SAM-Adapter: SAM-based segmentation model integrates prior knowledge for gaze image reflection noise removal

Impact Factor: 6.2; Zone 2, Engineering and Technology; Q1, Engineering, Multidisciplinary; Alexandria Engineering Journal; Pub Date: 2024-10-29; DOI: 10.1016/j.aej.2024.10.092

Ting Lei, Jing Chen, Jixiang Chen

Alexandria Engineering Journal, Volume 111, Pages 521-529. Full text: https://www.sciencedirect.com/science/article/pii/S1110016824012572


Abstract

Gaze tracking in HMDs (Head-Mounted Displays) loses accuracy because of highlight reflection noise from users' glasses. To address this, we present a denoising method that first pinpoints the noisy regions with a segmentation model and then fills the flawed regions with an image inpainting algorithm. For the segmentation stage, we introduce SF-SAM-Adapter (Spatial and Frequency aware SAM Adapter), a novel model built on the recently proposed large segmentation model SAM (Segment Anything Model). It injects prior knowledge about reflection noise, namely its strip-like shape in the spatial domain and its high-frequency content in the frequency domain, into SAM by integrating specially designed trainable adapter modules into the original structure, retaining the expressive power of the large model while adapting it better to the downstream task. The model achieves IoU (Intersection over Union) = 0.749 and Dice = 0.853 at a memory size of 13.9 MB, outperforming recent techniques including UNet, UNet++, BATFormer, FANet, MSA, and SAM2-Adapter. For the inpainting stage, we employ the inpainting algorithm LAMA (Large Mask inpainting), which improves gaze tracking accuracy by 0.502°, 0.182°, and 0.319° across three algorithms. The code and datasets used in the current study are available in the repository: https://github.com/leiting5297/SF-SAM-Adapter.git.
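The two segmentation metrics quoted in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `iou_and_dice`, the toy masks, and the convention for two empty masks are assumptions made here for illustration only.

```python
import numpy as np


def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    iou = float(intersection / union)
    dice = float(2 * intersection / total)
    return iou, dice


# Hypothetical 4x4 masks with a small strip-like "reflection" region.
pred = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
target = np.array([[0, 1, 1, 1],
                   [0, 1, 1, 0],
                   [0, 0, 0, 0],
                   [0, 0, 0, 0]])

iou, dice = iou_and_dice(pred, target)
print(f"IoU = {iou:.3f}, Dice = {dice:.3f}")
```

For a single pair of masks the two metrics are tied by Dice = 2·IoU / (1 + IoU). Values averaged over a dataset, such as those reported above, need not satisfy this identity exactly.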




Source journal

Alexandria Engineering Journal (Engineering: General Engineering)

CiteScore: 11.20

Self-citation rate: 4.40%

Articles published: 1015

Review time: 43 days

About the journal: Alexandria Engineering Journal is an international journal devoted to publishing high-quality papers in the field of engineering and applied science. It is cited in the Engineering Information Services (EIS) and the Chemical Abstracts (CA). Papers published in the journal are grouped into five sections:
• Mechanical, Production, Marine and Textile Engineering
• Electrical Engineering, Computer Science and Nuclear Engineering
• Civil and Architecture Engineering
• Chemical Engineering and Applied Sciences
• Environmental Engineering