{"title":"Improving Self-Supervised Medical Image Pre-Training by Early Alignment With Human Eye Gaze Information","authors":"Sheng Wang;Zihao Zhao;Zhenrong Shen;Bin Wang;Qian Wang;Dinggang Shen","doi":"10.1109/TMI.2025.3528965","DOIUrl":null,"url":null,"abstract":"Alignment between human knowledge and machine learning models is crucial for achieving efficient and interpretable AI systems. However, conventional self-supervised pre-training methods often suffer from low efficiency, as they do not incorporate human knowledge during the pre-training process and instead rely mainly on post-hoc alignment techniques. We propose Gaze Pre-Training (GzPT), a novel approach that introduces early alignment with human eye gaze information during the pre-training process to enhance both the learning efficiency and performance of self-supervised models. By leveraging contrastive learning to pull together images with similar gaze patterns, GzPT can effectively align the model with human attention during the pre-training. We demonstrate the effectiveness of our approach on three diverse medical image datasets, showing that GzPT can consistently outperform baseline methods and learn more meaningful and interpretable representations. Our findings also highlight the potential of incorporating human eye gaze as a form of passive knowledge to bridge the gap between human and machine learning in the self-supervised pre-training. 
Our code is available at Github.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 10","pages":"4063-4072"},"PeriodicalIF":0.0000,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on medical imaging","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10839445/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Alignment between human knowledge and machine learning models is crucial for achieving efficient and interpretable AI systems. However, conventional self-supervised pre-training methods often suffer from low efficiency, as they do not incorporate human knowledge during pre-training and instead rely mainly on post-hoc alignment techniques. We propose Gaze Pre-Training (GzPT), a novel approach that introduces early alignment with human eye gaze information during pre-training to enhance both the learning efficiency and the performance of self-supervised models. By leveraging contrastive learning to pull together images with similar gaze patterns, GzPT effectively aligns the model with human attention during pre-training. We demonstrate the effectiveness of our approach on three diverse medical image datasets, showing that GzPT consistently outperforms baseline methods and learns more meaningful and interpretable representations. Our findings also highlight the potential of incorporating human eye gaze as a form of passive knowledge to bridge the gap between human and machine learning in self-supervised pre-training. Our code is available on GitHub.
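The core idea the abstract describes — a contrastive objective in which images with similar gaze patterns are treated as positive pairs — can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`gaze_similarity`, `gaze_contrastive_loss`), the cosine-similarity threshold `sim_thresh`, and the InfoNCE-style formulation are all assumptions made here for illustration only.

```python
import numpy as np

def gaze_similarity(g1, g2):
    """Cosine similarity between two flattened gaze heatmaps.
    (Hypothetical helper; the paper's actual gaze-matching rule may differ.)"""
    a, b = np.ravel(g1), np.ravel(g2)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def gaze_contrastive_loss(embeddings, gaze_maps, tau=0.1, sim_thresh=0.8):
    """InfoNCE-style contrastive loss where samples whose gaze heatmaps
    exceed `sim_thresh` in cosine similarity are treated as positives,
    pulling their image embeddings together (assumed formulation)."""
    n = len(embeddings)
    # L2-normalize embeddings so dot products are cosine similarities.
    z = embeddings / (np.linalg.norm(embeddings, axis=1, keepdims=True) + 1e-8)
    logits = z @ z.T / tau
    loss, count = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n)
                     if j != i and gaze_similarity(gaze_maps[i], gaze_maps[j]) > sim_thresh]
        if not positives:
            continue
        # Log-softmax over all other samples (self excluded).
        row = np.delete(logits[i], i)
        row = row - row.max()  # numerical stability
        log_denom = np.log(np.exp(row).sum())
        for j in positives:
            idx = j if j < i else j - 1  # account for the deleted self index
            loss += -(row[idx] - log_denom)
            count += 1
    return loss / max(count, 1)
```

Under this formulation, a batch whose gaze-matched pairs already have similar embeddings yields a lower loss than one where gaze-matched images are embedded far apart, which is the alignment pressure the abstract attributes to GzPT.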