Improved Multi-attention Neural Networks for Image Emotion Regression and the Initial Introduction of CAPS

Rending Wang, Dongmei Ma

Frontiers in Computing and Intelligent Systems, vol. 32, published 2024-05-10. DOI: 10.54097/92w2rc31
Image sentiment analysis covers a broad class of tasks that classify or regress images containing emotional stimuli, and psychological research holds that different groups respond to the same stimuli with different emotions. Studying the influence of cultural background on image sentiment analysis therefore requires a dataset of emotional image stimuli that represents a particular cultural group. This paper introduces the Chinese Affective Picture System (CAPS), a dataset representing Chinese culture, and revises and tests it. Among current image emotion regression models, PDANet performs best, but its attention module struggles to extract cross-channel information and to correlate long-range information within an image. To address these shortcomings, this paper proposes a multi-attention network for image emotion regression that introduces the SimAM attention mechanism and improves the loss function to align more closely with psychological theory, and applies 10-fold cross-validation to CAPS. The network achieves MSE = 0.0188, R² = 0.359 on IAPS and MSE = 0.0169, R² = 0.463 on NAPS, outperforming PDANet; the best training result on CAPS is MSE = 0.0083, R² = 0.625. A paired-sample t-test on the results shows that all three affective dimensions are significantly positively correlated, with correlation coefficients r = 0.942, 0.895, and 0.943 respectively, indicating good internal consistency and strong application prospects for CAPS.
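SimAM, the attention mechanism the abstract introduces into the network, is parameter-free: each neuron's weight comes from an energy function over per-channel statistics rather than from learned layers, which is what lets it capture cross-channel and spatial saliency cheaply. A minimal NumPy sketch of the published SimAM formulation follows; it is a generic illustration, not the paper's own implementation, and `lam` is SimAM's regularization hyperparameter.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def simam(x, lam=1e-4):
    """Parameter-free SimAM attention over a (C, H, W) feature map.

    For each neuron, the inverse energy is
        E_inv = (x - mu)^2 / (4 * (sigma^2 + lam)) + 0.5
    computed per channel, and the output is x scaled by sigmoid(E_inv).
    """
    c, h, w = x.shape
    n = h * w - 1                                   # unbiased denominator
    mu = x.mean(axis=(1, 2), keepdims=True)         # per-channel mean
    d = (x - mu) ** 2                               # squared deviation
    var = d.sum(axis=(1, 2), keepdims=True) / n     # per-channel variance
    e_inv = d / (4.0 * (var + lam)) + 0.5
    return x * sigmoid(e_inv)                       # reweighted feature map
```

Because `e_inv >= 0.5`, every attention weight lies in (0.62, 1), so the module rescales features without changing their sign.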
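The evaluation protocol described above (10-fold cross-validation with MSE and R² as regression metrics, plus correlation between dimensions) can be sketched with plain NumPy helpers; this is a generic illustration of the metrics, not the paper's code, and `kfold_splits` is a hypothetical helper name.

```python
import numpy as np

def kfold_splits(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and partition them into k disjoint folds."""
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, k)

def mse(y_true, y_pred):
    """Mean squared error."""
    return float(np.mean((y_true - y_pred) ** 2))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def pearson_r(a, b):
    """Pearson correlation coefficient between two rating vectors."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a @ b) / np.sqrt((a @ a) * (b @ b)))
```

In a 10-fold run, each fold serves once as the held-out set while the model trains on the other nine; the reported MSE and R² are typically averaged over the folds.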