Mask-Shift-Inference: A novel paradigm for domain generalization
Journal: Neural Networks (impact factor 6.0; JCR Q1, Computer Science, Artificial Intelligence)
Publication date: 2024-08-12
DOI: 10.1016/j.neunet.2024.106629
URL: https://www.sciencedirect.com/science/article/pii/S0893608024005537
Citations: 0
Abstract
Domain Generalization (DG) addresses Out-Of-Distribution (OOD) generalization: learning a robust model that transfers the knowledge acquired from the source domains to an unseen target domain. Domain shift, however, makes domain-invariant representation learning challenging. Guided by fine-grained knowledge, we propose Mask-Shift-Inference (MSI), a novel paradigm for DG built on Convolutional Neural Network (CNN) architectures. Rather than relying on a series of constraints and assumptions for model optimization, the paradigm shifts the focus to feature channels in the latent space for domain-invariant representation learning. We put forward a two-branch working mode consisting of a main module and multiple domain-specific sub-modules. Each sub-module predicts well only in its own domain and poorly in the other source domains; this provides the main module with fine-grained knowledge guidance and improves the cognitive ability of MSI. First, during forward propagation of the main module, MSI accurately discards unstable channels, identified by spurious classifications that vary across domains; such channels have domain-specific prediction limitations and are not conducive to generalization. In this process, a progressive scheme adaptively increases the masking ratio as training proceeds, further reducing the risk of overfitting. Next, before the formal prediction, the paradigm enters the compatible shifting stage. While maximizing semantic retention, we perform domain style matching and shifting with a simple transformation based on the Fourier transform, which explicitly and safely shifts the target domain back to the source domain whose style is closest to it, requiring no additional model updates and reducing the domain gap. Finally, MSI enters the formal inference stage. The updated target domain is predicted by the main module trained in the previous stage, benefiting from the familiar knowledge of the nearest source domain's masking scheme. The paradigm is logically progressive: it intuitively excludes the confounding influence of domain-specific spurious information, mitigates domain shift, and implicitly performs semantically invariant representation learning, achieving robust OOD generalization. Extensive experiments on the PACS, VLCS, Office-Home and DomainNet datasets verify the superiority and effectiveness of the proposed method.
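The abstract names three mechanisms without giving formulas: a masking ratio that grows with training progress, masking of unstable feature channels, and a Fourier-based shift of a target image toward the closest source style. The NumPy sketch below illustrates one plausible reading of each and is not the authors' implementation. Everything specific in it is an assumption: the linear ramp and `max_ratio`, the per-channel `instability` score, the use of amplitude-spectrum distance for style matching, the low-frequency amplitude swap with window size `beta`, and all function names.

```python
import numpy as np


def progressive_mask_ratio(step, total_steps, max_ratio=0.3):
    """Masking ratio that grows with training progress.

    Assumption: a linear ramp from 0 to max_ratio; the abstract only says
    the ratio increases adaptively with training progress.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return max_ratio * progress


def mask_unstable_channels(features, instability, ratio):
    """Zero the `ratio` fraction of channels with the highest instability.

    features: array of shape (C, H, W).
    instability: array of shape (C,), a hypothetical per-channel score
    (higher means less stable across domains).
    """
    num_channels = features.shape[0]
    num_masked = int(round(ratio * num_channels))
    keep = np.ones(num_channels, dtype=features.dtype)
    if num_masked > 0:
        keep[np.argsort(instability)[-num_masked:]] = 0
    return features * keep[:, None, None]


def amplitude_spectrum(image):
    """Amplitude of the 2-D Fourier transform over the two spatial axes."""
    return np.abs(np.fft.fft2(image, axes=(0, 1)))


def nearest_source_domain(target, source_styles):
    """Name of the source domain whose stored amplitude is closest to the target's.

    source_styles: dict mapping a domain name to a reference amplitude
    spectrum with the same shape as the target image (assumed style proxy).
    """
    target_amp = amplitude_spectrum(target)
    return min(
        source_styles,
        key=lambda name: np.linalg.norm(target_amp - source_styles[name]),
    )


def shift_style(target, source_amp, beta=0.1):
    """Swap the low-frequency amplitude of `target` for `source_amp`, keeping phase."""
    spectrum = np.fft.fft2(target, axes=(0, 1))
    phase = np.angle(spectrum)
    # Centre the zero frequency so the low-frequency band is a central window.
    amp = np.fft.fftshift(np.abs(spectrum), axes=(0, 1))
    src = np.fft.fftshift(source_amp, axes=(0, 1))
    h, w = target.shape[:2]
    half_h, half_w = int(h * beta / 2), int(w * beta / 2)
    ch, cw = h // 2, w // 2
    rows = slice(ch - half_h, ch + half_h + 1)
    cols = slice(cw - half_w, cw + half_w + 1)
    amp[rows, cols] = src[rows, cols]
    amp = np.fft.ifftshift(amp, axes=(0, 1))
    # Recombine the mixed amplitude with the original phase (semantic content).
    return np.fft.ifft2(amp * np.exp(1j * phase), axes=(0, 1)).real
```

In this sketch the phase of the target image is left untouched, reflecting the common assumption that phase carries most of the semantic content while amplitude carries style. A target image would first be matched with `nearest_source_domain`, then passed through `shift_style` with that domain's reference amplitude before being fed to the trained model; no model parameters are updated in either step.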
About the journal:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.