Improving Self-Supervised Medical Image Pre-Training by Early Alignment With Human Eye Gaze Information

Sheng Wang;Zihao Zhao;Zhenrong Shen;Bin Wang;Qian Wang;Dinggang Shen
IEEE Transactions on Medical Imaging, vol. 44, no. 10, pp. 4063-4072.
DOI: 10.1109/TMI.2025.3528965. Published online: 2025-01-13. Citations: 0.

Abstract

Alignment between human knowledge and machine learning models is crucial for building efficient and interpretable AI systems. However, conventional self-supervised pre-training methods often suffer from low efficiency: they do not incorporate human knowledge during pre-training and instead rely mainly on post-hoc alignment techniques. We propose Gaze Pre-Training (GzPT), a novel approach that introduces early alignment with human eye-gaze information during pre-training to enhance both the learning efficiency and the performance of self-supervised models. By leveraging contrastive learning to pull together images with similar gaze patterns, GzPT effectively aligns the model with human attention during pre-training. We demonstrate the effectiveness of our approach on three diverse medical image datasets, showing that GzPT consistently outperforms baseline methods and learns more meaningful, interpretable representations. Our findings also highlight the potential of incorporating human eye gaze as a form of passive knowledge to bridge the gap between human and machine learning in self-supervised pre-training. Our code is available on GitHub.
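The abstract describes the core mechanism at a high level: contrastive learning that pulls together images whose recorded eye-gaze patterns are similar. The paper's exact loss is not given on this page, so the following is only a minimal NumPy sketch of that idea under stated assumptions — gaze-similar pairs are treated as positives in a SupCon-style objective, with a hypothetical cosine-similarity threshold `thresh` deciding which pairs count as gaze-similar.

```python
import numpy as np

def cosine_sim(a, b):
    """Pairwise cosine similarity between rows of a and rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def gaze_contrastive_loss(z, gaze, tau=0.1, thresh=0.8):
    """Illustrative contrastive loss with gaze-defined positives.

    z:    (N, d) image embeddings from the encoder being pre-trained
    gaze: (N, m) flattened eye-gaze heatmaps, one per image
    Pairs whose gaze heatmaps have cosine similarity >= thresh are
    treated as positives; all other samples serve as negatives.
    """
    n = z.shape[0]
    logits = cosine_sim(z, z) / tau           # temperature-scaled embedding sims
    gsim = cosine_sim(gaze, gaze)             # gaze-pattern similarities
    eye = np.eye(n, dtype=bool)
    pos = (gsim >= thresh) & ~eye             # gaze-similar pairs = positives
    logits = np.where(eye, -np.inf, logits)   # exclude self-pairs
    # log-softmax over all other samples for each anchor
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # average negative log-prob over each anchor's positive set
    losses = [-log_prob[i, pos[i]].mean() for i in range(n) if pos[i].any()]
    return float(np.mean(losses)) if losses else 0.0
```

Minimizing this loss pushes embeddings of gaze-similar images together and all others apart, which is one plausible way to realize the "pull together images with similar gaze patterns" behavior the abstract describes; the authors' actual formulation may differ (e.g. soft gaze-similarity weights instead of a hard threshold).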