SF-SAM-Adapter: SAM-based segmentation model integrates prior knowledge for gaze image reflection noise removal

Alexandria Engineering Journal · IF 6.2 · Zone 2 (Engineering Technology) · JCR Q1 (Engineering, Multidisciplinary) · Pub Date: 2024-10-29 · DOI: 10.1016/j.aej.2024.10.092
Ting Lei, Jing Chen, Jixiang Chen
Alexandria Engineering Journal, vol. 111, pp. 521-529. Journal Article. https://www.sciencedirect.com/science/article/pii/S1110016824012572
Citations: 0

Abstract

Gaze tracking in HMDs (Head-Mounted Displays) loses accuracy when the user's glasses produce highlight reflection noise in the eye image. To address this, we present a denoising method that first locates the noisy regions with a segmentation model and then fills those regions with an image inpainting algorithm. For the segmentation stage, we introduce a novel model based on the recently proposed large segmentation model SAM (Segment Anything Model), called SF-SAM-Adapter (Spatial and Frequency aware SAM Adapter). It injects prior knowledge about reflection noise, namely its strip-like shape in the spatial domain and its high-frequency character in the frequency domain, into SAM by integrating specially designed trainable adapter modules into the original structure, retaining the expressive power of the large model while adapting it better to the downstream task. We achieved segmentation metrics of IoU (Intersection over Union) = 0.749 and Dice = 0.853 at a memory size of 13.9 MB, outperforming recent techniques including UNet, UNet++, BATFormer, FANet, MSA, and SAM2-Adapter. For the inpainting stage, we employ the inpainting algorithm LAMA (Large Mask inpainting), which improves gaze tracking accuracy by 0.502°, 0.182°, and 0.319° across three gaze tracking algorithms. The code and datasets used in the current study are available in the repository: https://github.com/leiting5297/SF-SAM-Adapter.git.
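The IoU and Dice scores quoted in the abstract are standard overlap measures between a predicted mask and a ground-truth mask. The following is a minimal sketch of how they can be computed for one pair of binary masks, assuming NumPy arrays as input. The function name `iou_and_dice`, the `eps` smoothing term, and the toy masks are illustrative assumptions and are not taken from the authors' repository; the abstract does not say how the scores are aggregated over the dataset.

```python
import numpy as np


def iou_and_dice(pred, target, eps=1e-7):
    """Return (IoU, Dice) for two binary masks of the same shape.

    Hypothetical helper for illustration only. `eps` avoids division by
    zero when both masks are empty (both scores then evaluate to 1.0).
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")

    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()

    iou = (intersection + eps) / (union + eps)      # |A ∩ B| / |A ∪ B|
    dice = (2 * intersection + eps) / (total + eps)  # 2|A ∩ B| / (|A| + |B|)
    return float(iou), float(dice)


# Toy masks: 3 overlapping pixels, 5 pixels in the union, 4 pixels in each mask.
pred = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
target = np.array([[0, 1, 1, 1],
                   [0, 1, 0, 0],
                   [0, 0, 0, 0],
                   [0, 0, 0, 0]])

iou, dice = iou_and_dice(pred, target)
print(f"IoU = {iou:.3f}, Dice = {dice:.3f}")
```

For a single mask pair the two scores are linked by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. Scores averaged over many images need not satisfy this identity exactly.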
Source journal
Alexandria Engineering Journal (Engineering: General Engineering)
CiteScore: 11.20
Self-citation rate: 4.40%
Articles published: 1015
Review time: 43 days
About the journal: Alexandria Engineering Journal is an international journal devoted to publishing high-quality papers in the field of engineering and applied science. It is cited in the Engineering Information Services (EIS) and the Chemical Abstracts (CA). Papers published in the journal are grouped into five sections:
• Mechanical, Production, Marine and Textile Engineering
• Electrical Engineering, Computer Science and Nuclear Engineering
• Civil and Architecture Engineering
• Chemical Engineering and Applied Sciences
• Environmental Engineering