SETA: Semantic-Aware Edge-Guided Token Augmentation for Domain Generalization

Jintao Guo, Lei Qi, Yinghuan Shi, Yang Gao
{"title":"SETA: Semantic-Aware Edge-Guided Token Augmentation for Domain Generalization","authors":"Jintao Guo;Lei Qi;Yinghuan Shi;Yang Gao","doi":"10.1109/TIP.2024.3470517","DOIUrl":null,"url":null,"abstract":"Domain generalization (DG) aims to enhance the model robustness against domain shifts without accessing target domains. A prevalent category of methods for DG is data augmentation, which focuses on generating virtual samples to simulate domain shifts. However, existing augmentation techniques in DG are mainly tailored for convolutional neural networks (CNNs), with limited exploration in token-based architectures, i.e., vision transformer (ViT) and multi-layer perceptrons (MLP) models. In this paper, we study the impact of prior CNN-based augmentation methods on token-based models, revealing their performance is suboptimal due to the lack of incentivizing the model to learn holistic shape information. To tackle the issue, we propose the Semantic-aware Edge-guided Token Augmentation (SETA) method. SETA transforms token features by perturbing local edge cues while preserving global shape features, thereby enhancing the model learning of shape information. To further enhance the generalization ability of the model, we introduce two stylized variants of our method combined with two state-of-the-art (SOTA) style augmentation methods in DG. We provide a theoretical insight into our method, demonstrating its effectiveness in reducing the generalization risk bound. Comprehensive experiments on five benchmarks prove that our method achieves SOTA performances across various ViT and MLP architectures. Our code is available at \n<uri>https://github.com/lingeringlight/SETA</uri>\n.","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":"33 ","pages":"5622-5636"},"PeriodicalIF":0.0000,"publicationDate":"2024-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10705912/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Domain generalization (DG) aims to enhance model robustness against domain shifts without accessing target domains. A prevalent category of DG methods is data augmentation, which focuses on generating virtual samples to simulate domain shifts. However, existing augmentation techniques in DG are mainly tailored for convolutional neural networks (CNNs), with limited exploration of token-based architectures, i.e., vision transformer (ViT) and multi-layer perceptron (MLP) models. In this paper, we study the impact of prior CNN-based augmentation methods on token-based models, revealing that their performance is suboptimal because they do not incentivize the model to learn holistic shape information. To tackle this issue, we propose the Semantic-aware Edge-guided Token Augmentation (SETA) method. SETA transforms token features by perturbing local edge cues while preserving global shape features, thereby enhancing the model's learning of shape information. To further improve generalization, we introduce two stylized variants of our method that combine it with two state-of-the-art (SOTA) style augmentation methods in DG. We provide a theoretical insight into our method, demonstrating its effectiveness in reducing the generalization risk bound. Comprehensive experiments on five benchmarks show that our method achieves SOTA performance across various ViT and MLP architectures. Our code is available at https://github.com/lingeringlight/SETA.
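
The core idea, perturbing local edge cues in the token features while keeping the global shape signal intact, can be illustrated with a short sketch. The snippet below is a minimal, hypothetical PyTorch illustration and not the paper's implementation (see the linked repository for that): it assumes ViT-style patch tokens of shape (B, N, C) laid out on an H x W grid, approximates "local edge cues" with a Sobel filter over the token map, and injects noise only into the edge-weighted component. The function name edge_guided_token_augment, the Sobel-based edge estimate, and the Gaussian perturbation are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of edge-guided token augmentation (NOT the official SETA code).
# Assumes ViT-style patch tokens of shape (B, N, C) on an (H, W) grid.
import torch
import torch.nn.functional as F


def edge_guided_token_augment(tokens: torch.Tensor,
                              grid_hw: tuple,
                              noise_std: float = 0.5) -> torch.Tensor:
    """tokens: (B, N, C) patch tokens; grid_hw: (H, W) with H * W == N."""
    B, N, C = tokens.shape
    H, W = grid_hw
    assert H * W == N, "token count must match the patch grid"

    # Reshape tokens into an image-like layout so spatial filters can be applied.
    fmap = tokens.transpose(1, 2).reshape(B, C, H, W)

    # Sobel kernels to estimate local edge strength on the mean token intensity.
    sobel_x = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                           device=tokens.device).view(1, 1, 3, 3)
    sobel_y = sobel_x.transpose(2, 3)
    intensity = fmap.mean(dim=1, keepdim=True)                  # (B, 1, H, W)
    gx = F.conv2d(intensity, sobel_x, padding=1)
    gy = F.conv2d(intensity, sobel_y, padding=1)
    edge = torch.sqrt(gx ** 2 + gy ** 2)
    edge = edge / (edge.amax(dim=(2, 3), keepdim=True) + 1e-6)  # normalize to [0, 1]

    # Split each token into an edge-weighted (local edge/texture) part and a
    # complementary part that carries the smoother, shape-level signal.
    edge_part = fmap * edge
    shape_part = fmap * (1.0 - edge)

    # Perturb only the edge-weighted part with Gaussian noise scaled by its
    # per-channel standard deviation; the shape-level part is left untouched.
    std = edge_part.flatten(2).std(dim=2, keepdim=True).unsqueeze(-1)
    perturbed_edge = edge_part + noise_std * std * torch.randn_like(edge_part)

    augmented = shape_part + perturbed_edge
    return augmented.reshape(B, C, N).transpose(1, 2)           # back to (B, N, C)


if __name__ == "__main__":
    toks = torch.randn(2, 196, 384)            # e.g., ViT-S/16 tokens for a 224px image
    out = edge_guided_token_augment(toks, (14, 14))
    print(out.shape)                           # torch.Size([2, 196, 384])
```

In practice such a transform would be applied only during training on source-domain batches. The stylized variants mentioned in the abstract would additionally perturb feature statistics (e.g., channel-wise mean and variance) in the spirit of existing DG style augmentation methods; that combination is not shown here.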