Causal softmax for out-of-distribution generalization
Authors: Jing Luo, Wanqing Zhao, Jinye Peng
Journal: Digital Signal Processing, volume 156, Article 104861 (impact factor 2.9; JCR Q2, Engineering, Electrical & Electronic)
Published: 2024-11-07
DOI: 10.1016/j.dsp.2024.104861
URL: https://www.sciencedirect.com/science/article/pii/S1051200424004858
Citation count: 0
Abstract
Most supervised learning algorithms follow the Empirical Risk Minimization (ERM) principle, which assumes that training and test data are independent and identically distributed (IID). However, models trained under this assumption may inadvertently learn spurious correlations introduced by confounding factors in the training data, and these correlations need not hold on out-of-distribution (OOD) data. This can result in suboptimal performance on the test data, ultimately compromising the model's practical reliability. In this paper, we propose a novel causal softmax algorithm to address this challenge. First, we introduce a method to define causal and non-causal features in image classification tasks. Then, by employing a causal feature discovery module, we analyze high-level semantic activations extracted by the feature extraction network to distinguish between causal and non-causal features. Subsequently, we penalize the classifier weights associated with non-causal features to mitigate their influence, enabling the classifier to establish associations based solely on causal features and labels. Extensive experiments on public datasets such as NICO and ImageNet-9 demonstrate the superiority of our approach.
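The weight-penalty step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the form of the penalty or how the causal feature discovery module labels features, so everything here is an assumption for illustration: a linear softmax classifier, a boolean mask `noncausal_mask` supplied by hand, a squared (L2-style) penalty, and the hypothetical names `noncausal_penalty`, `penalized_loss`, and `lam`. It is not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def noncausal_penalty(weights, noncausal_mask):
    """Sum of squared classifier weights attached to features flagged non-causal.

    Assumption: a squared penalty; the abstract does not specify the form.
    """
    return sum(
        w * w
        for row in weights  # one row of weights per class
        for w, flagged in zip(row, noncausal_mask)
        if flagged
    )


def penalized_loss(weights, features, label, noncausal_mask, lam=1.0):
    """Cross-entropy of a linear softmax classifier plus the weighted penalty."""
    logits = [sum(w * x for w, x in zip(row, features)) for row in weights]
    probs = softmax(logits)
    return -math.log(probs[label]) + lam * noncausal_penalty(weights, noncausal_mask)


# Toy example: two classes, two features; the second feature is assumed non-causal.
weights = [[1.0, 0.5], [-1.0, -0.5]]
features = [2.0, 1.0]
noncausal_mask = [False, True]

plain = penalized_loss(weights, features, 0, noncausal_mask, lam=0.0)
penalized = penalized_loss(weights, features, 0, noncausal_mask, lam=1.0)
```

Minimizing a loss of this shape pushes the weights on flagged features toward zero, so the logits come to depend mainly on the features left unflagged. The coefficient `lam` trades off fit against that suppression.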
About the journal:
Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top-quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal.
The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• chemioinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy